Preface



Though quantum theory is celebrating its 100th anniversary this year, quan- 
tum information processing is still a remarkably young research field. The 
questions driving this research field reflect a profound change in the gen- 
eral attitude towards the fundamental aspects of quantum theory. So far, 
research on the foundations of quantum theory has been concerned mainly 
with the theoretical exploration of those particular features which distin- 
guish quantum theory from classical physics. The main intention of quantum 
information processing is to exploit these specific features for technological 
purposes. As early as 1935, Erwin Schrödinger had already noted that one of
these characteristic features of quantum theory is the phenomenon of entan- 
glement. Many years passed from this early insight until John Bell realized 
the quantitative consequences of the corresponding quantum correlations in 
his famous work from 1964. These theoretical predictions inspired numerous 
experiments, which all support the peculiar features predicted for quantum 
correlations. From these purely theoretical insights, it again required a long
period of development to arrive at those potentially useful applications which 
are now of central interest for the processing of quantum information. 

The following contributions provide an introductory overview of basic 
problems, methods and topical results in this research field. The idea of pro- 
ducing this volume was born at a symposium on this subject which was held 
at the 1999 annual spring meeting of the Deutsche Physikalische Gesellschaft 
in Heidelberg. This symposium was organized jointly by the Quantum Op- 
tics and Mathematical Physics sections. The widespread interest, the success 
of this symposium and the initiative of Prof. Frank Steiner, the head of the 
Mathematical Physics section, motivated us to edit a volume on basic prob- 
lems, methods and recent results in this rapidly evolving field. This book 
should be useful for students and active researchers in physics, computer sci- 
ence and mathematics who want to learn about the most recent developments 
in this exciting research field. 



Ulm, March 2001 



Gernot Alber



1 From the Foundations of Quantum Theory
to Quantum Technology — an Introduction 



Gernot Alber 



Nowadays, the new technological prospects of processing quantum informa- 
tion in quantum cryptography [1], quantum computation [2] and quantum 
communication [3] attract not only physicists but also researchers from other 
scientific communities, mainly computer scientists, discrete mathematicians 
and electrical engineers. Current developments demonstrate that character- 
istic quantum phenomena which appear to be surprising from the point of 
view of classical physics may enable one to perform tasks of practical interest 
better than by any other known method. In quantum cryptography, the no- 
cloning property of quantum states [4] or the phenomenon of entanglement
[5] helps in the exchange of secret keys between various parties, thus en- 
suring the security of one-time-pad cryptosystems [6]. Quantum parallelism 
[7], which relies on quantum interference and which typically also involves 
entanglement [8], may be exploited for accelerating computations. Quantum 
algorithms are even capable of factorizing numbers more efficiently than any 
known classical method [9], thus challenging the security of public-key
cryptosystems such as the RSA system [6]. Classical information and quantum
information based on entangled quantum systems can be used for quantum 
communication purposes such as teleporting quantum states [10, 11]. 

Owing to significant experimental advances, methods for processing quantum
information have developed rapidly during the last few years.^1 Basic
quantum communication schemes have been realized with photons [10, 11],
and basic quantum logical operations have been demonstrated with trapped
ions [13, 14] and with nuclear spins of organic molecules [15]. Also, cavity
quantum electrodynamical setups [16], atom chips [17], ultracold atoms in 
optical lattices [18, 19], ions in an array of microtraps [20] and solid-state 
devices [21, 22, 23] are promising physical systems for future developments 
in this research area. All these technologically oriented, current developments 
rely on fundamental quantum phenomena, such as quantum interference, the 
measurement process and entanglement. These phenomena and their distinc- 
tive differences from basic concepts of classical physics have always been of 
central interest in research on the foundations of quantum theory. However, 
in emphasizing their technological potential, the advances in quantum information
processing reflect a profound change in the general attitude towards
these fundamental phenomena. Thus, after almost two decades of impressive
scientific achievements, it is time to retrace some of those significant early
developments in quantum physics which are at the heart of quantum technology
and which have shaped its present-day appearance.

^1 Numerous recent experimental and theoretical achievements are discussed in [12].



1.1 Early Developments 

Many of the current methods and developments in the processing of quantum 
information have grown out of a long struggle of physicists with the foun- 
dations of modern quantum theory. The famous considerations by Einstein, 
Podolsky and Rosen (EPR) [24] on reality, locality and completeness of phys- 
ical theories are an early example in this respect. The critical questions raised 
by these authors inspired many researchers to study quantitatively the essen- 
tial difference between quantum physics and the classical concepts of reality 
and locality. The breakthrough was the discovery by J.S. Bell [25] that the 
statistical correlations of entangled quantum states are incompatible with the 
predictions of any theory which is based on the concepts of reality and lo- 
cality of EPR. The constraints imposed on statistical correlations within the 
framework of a local, realistic theory (LRT) are expressed by Bell’s inequality 
[25]. As the concept of entanglement and its peculiar correlation properties 
have been of fundamental significance for the development of quantum infor- 
mation processing, it is worth recalling some of its most elementary features 
in more detail. 

1.1.1 Entanglement and Local, Realistic Theories 

In order to clarify the characteristic differences between quantum mechan- 
ical correlations originating from entangled states and classical correlations 
originating from local, realistic theories, let us consider the following basic 
experimental setup (Fig. 1.1). A quantum mechanical two-particle system,
such as a photon pair, is produced by a source s. Polarization properties of
each of these particles are measured subsequently by two distant observers A
and B. Observers A and B perform polarization measurements by randomly
selecting one of the directions $\alpha_1$ or $\alpha_2$, and $\beta_1$ or $\beta_2$, respectively, in each
experiment. Furthermore, let us assume that for each of these directions only
two measurement results are possible, namely +1 or -1. In the case of photons
these measurement results would correspond to horizontal or vertical
polarization.

Fig. 1.1. Basic experimental setup for testing Bell’s inequality; the choices of the
directions of polarization on the Bloch sphere for optimal violation of the CHSH
inequality (1.3) correspond to $\varphi = \pi/4$ for spin-1/2 systems

What are the restrictions imposed on correlations of the measurement 
results if the physical process can be described by an underlying LRT with 
unknown (hidden) parameters? For this purpose, let us first of all summarize 
the minimal set of conditions any LRT should fulfill. 

1. The state of the two-particle system is determined uniquely by a parameter
$\lambda$, which may denote an arbitrary set of discrete or continuous labels.
Thus the most general observable of observer A or B for the experimental
setup depicted in Fig. 1.1 is a function of the variables $(\alpha_i, \beta_j, \lambda)$. If the
actual value of the parameter $\lambda$ is unknown (hidden), the state of the
two-particle system has to be described by a normalized probability
distribution $P(\lambda)$, i.e. $\int_\Lambda d\lambda\, P(\lambda) = 1$, where $\Lambda$ characterizes the set of all
possible states. The state variable $\lambda$ determines all results of all possible
measurements, irrespective of whether these measurements are performed
or not. It represents the element of physical reality inherent in the
arguments of EPR: “If, without in any way disturbing a system, we can
predict with certainty the value of a physical quantity, then there exists
an element of physical reality corresponding to this physical quantity”
[24].

2. The measurement results of each of the distant (space-like separated)
observers are independent of the choice of polarizations of the other
observer. This assumption reflects the locality concept inherent in the
arguments of EPR: “The real factual situation of the system A is independent
of what is done with the system B, which is spatially separated from the
former” [24]. Thus, taking into account also this locality requirement, the
most general observable of observer A for the experimental setup depicted
in Fig. 1.1 can depend on the variables $\alpha_i$ and $\lambda$ (for B, $\beta_j$ and $\lambda$) only.
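The two conditions above can be made concrete with a toy model. The following is a minimal sketch, not taken from the text: the hidden variable is assumed to be a single angle distributed uniformly on [0, 2π), and the outcome functions `outcome_a` and `outcome_b` are hypothetical choices that depend only on the local setting and on the hidden variable, as condition 2 requires.

```python
import math

def outcome_a(alpha, lam):
    """Observer A's result: depends only on A's own setting alpha and on lam."""
    return 1 if math.cos(alpha - lam) >= 0 else -1

def outcome_b(beta, lam):
    """Observer B's result: depends only on B's own setting beta and on lam."""
    return -outcome_a(beta, lam)

def lrt_correlation(alpha, beta, samples=3600):
    """Average the product a*b over a uniform distribution P(lam) on [0, 2*pi)."""
    total = 0
    for k in range(samples):
        lam = 2 * math.pi * k / samples  # grid point standing in for the integral over lam
        total += outcome_a(alpha, lam) * outcome_b(beta, lam)
    return total / samples

# Equal settings give perfect anticorrelation in this toy model.
print(lrt_correlation(0.0, 0.0))
```

Every result in this model is fixed by the hidden variable before any measurement is chosen, which is the sense of "element of physical reality" in condition 1.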

These two assumptions, which reflect fundamental notions of classical physics
as used in the arguments of EPR, restrict significantly the possible
correlations of measurements performed by both distant observers. According
to these assumptions, the following measurement results are possible:
$a(\alpha_i, \lambda) \equiv a_i = \pm 1$ ($i = 1, 2$) for observer A, and $b(\beta_i, \lambda) \equiv b_i = \pm 1$
($i = 1, 2$) for observer B. For a given value of the state variable $\lambda$, all these
possible measurement results of the dichotomic (two-valued) variables $a_i$ and
$b_i$ ($i = 1, 2$) can be combined in the single relation

$$ |(a_1 + a_2)\, b_1 + (a_2 - a_1)\, b_2| = 2 . \tag{1.1} $$
It should be mentioned that this relation is counterfactual [26] in the sense
that it involves both results of actually performed measurements and
possible results of unperformed measurements. All these measurement results
are determined uniquely by the state variable $\lambda$. If this state variable is
unknown (hidden), (1.1) has to be averaged over the corresponding probability
distribution $P(\lambda)$. This yields an inequality for the statistical mean values,

$$ \langle a_i b_j \rangle_{\mathrm{LRT}} = \int_\Lambda d\lambda\, P(\lambda)\, a(\alpha_i, \lambda)\, b(\beta_j, \lambda) \quad (i, j = 1, 2) , \tag{1.2} $$

which is a variant of Bell’s inequality and which is due to Clauser, Horne,
Shimony and Holt (CHSH) [27], namely

$$ |\langle a_1 b_1 \rangle_{\mathrm{LRT}} + \langle a_2 b_1 \rangle_{\mathrm{LRT}} + \langle a_2 b_2 \rangle_{\mathrm{LRT}} - \langle a_1 b_2 \rangle_{\mathrm{LRT}}| \le 2 . \tag{1.3} $$

This inequality characterizes the restrictions imposed on the correlations
between dichotomic variables of two distant observers within the framework of
any LRT. There are other, equivalent forms of Bell’s inequality, one of which
was proposed by Wigner [28] and will be discussed in Chap. 3.

Quantum mechanical correlations can violate this inequality. For this
purpose let us consider, for example, the spin-entangled singlet state

$$ |\psi\rangle = \frac{1}{\sqrt{2}} \bigl( |+1\rangle_A |-1\rangle_B - |-1\rangle_A |+1\rangle_B \bigr) , \tag{1.4} $$

where $|\pm 1\rangle_A$ and $|\pm 1\rangle_B$ denote the eigenstates of the Pauli spin operators
$\sigma_z^A$ and $\sigma_z^B$, with eigenvalues ±1. Quantum mechanically, the measurement
of the dichotomic polarization variables $a_i$ and $b_i$ is represented by the spin
operators $\hat a_i = \alpha_i \cdot \sigma^A$ and $\hat b_i = \beta_i \cdot \sigma^B$. ($\sigma^A$, for example, denotes the vector of
Pauli spin operators referring to observer A, i.e. $\sigma^A = \sum_{i=x,y,z} \sigma_i^A e_i$, where
$e_i$ are the unit vectors.) The corresponding quantum mechanical correlations
entering the CHSH inequality (1.3) are given by

$$ \langle a_i b_j \rangle_{\mathrm{QM}} = \langle \psi | \hat a_i \hat b_j | \psi \rangle = -\alpha_i \cdot \beta_j . \tag{1.5} $$

Choosing the directions of the polarizations $(\alpha_1, \beta_1)$, $(\beta_1, \alpha_2)$, $(\alpha_2, \beta_2)$ on
the Bloch sphere so that they involve an angle of $\pi/4$ (see Fig. 1.1), one finds
a maximal violation of inequality (1.3), namely

$$ |\langle a_1 b_1 \rangle_{\mathrm{QM}} + \langle a_2 b_1 \rangle_{\mathrm{QM}} + \langle a_2 b_2 \rangle_{\mathrm{QM}} - \langle a_1 b_2 \rangle_{\mathrm{QM}}| = 2\sqrt{2} > 2 . \tag{1.6} $$

Thus, for this entangled state, the quantum mechanical correlations between
the measurement results of the distant observers A and B are stronger than
any possible correlation within the framework of an LRT. Obviously, these
correlations are incompatible with the classical notions of reality and
locality of any LRT. It is these peculiar quantum correlations originating from
entanglement which have been of central interest in research on the
foundations of quantum theory and which are also of central interest for quantum
information processing.
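Both sides of this comparison can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original argument: it enumerates all predetermined ±1 results to confirm relation (1.1), and it evaluates the quantum side directly from the correlation formula (1.5), assuming the four directions lie in one plane and are separated successively by the angle `phi`. The function names are hypothetical.

```python
import itertools
import math

def chsh_lrt_values():
    """All values of (a1 + a2)*b1 + (a2 - a1)*b2 for predetermined +/-1 results."""
    return {(a1 + a2) * b1 + (a2 - a1) * b2
            for a1, a2, b1, b2 in itertools.product((+1, -1), repeat=4)}

def singlet_correlation(theta_a, theta_b):
    """Singlet correlation -alpha . beta of (1.5) for coplanar unit vectors."""
    return -math.cos(theta_a - theta_b)

def chsh_quantum(phi):
    """CHSH combination for directions alpha1, beta1, alpha2, beta2 spaced by phi."""
    alpha1, beta1, alpha2, beta2 = 0.0, phi, 2 * phi, 3 * phi
    e = singlet_correlation
    return abs(e(alpha1, beta1) + e(alpha2, beta1)
               + e(alpha2, beta2) - e(alpha1, beta2))

print(sorted(chsh_lrt_values()))   # [-2, 2]
print(chsh_quantum(math.pi / 4))   # close to 2*sqrt(2)
```

Since every hidden-variable assignment gives exactly ±2, any average over $P(\lambda)$ is bounded by 2 in magnitude, whereas the singlet correlations at $\varphi = \pi/4$ reach $2\sqrt{2}$.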
So far, numerous experiments testing and supporting violations of Bell’s
inequality [29, 30, 31] have been performed.^2 However, from a strictly logical
point of view, the results of all these experiments could still be explained 
by an LRT, owing to two loopholes, namely the locality and the detection 
loopholes. The locality loophole concerns violations of the crucial locality as- 
sumption underlying the derivation of Bell’s inequality. According to this as- 
sumption one has to ensure that any signaling between two distant observers 
A and B is impossible. The recently performed experiment of G. Weihs et 
al. [31] succeeded in fulfilling this locality requirement by choosing the sep- 
aration between these observers to be sufficiently large. However, so far all 
experiments have involved low detection efficiencies, so that in principle the 
observed correlations which violate Bell’s inequality can still be explained by 
an LRT [32, 33]. This latter detection loophole constitutes a major experi- 
mental challenge, and it is one of the current experimental aims to close both 
the detection loophole and the locality loophole simultaneously [34, 35, 36]. 

The concepts of physical reality and locality which lead to inequality (1.3) 
can also lead to logical contradictions with quantum theory which are not of 
statistical origin. This becomes particularly apparent when one considers an 
entangled three-particle state of the form

$$ |\psi\rangle_{\mathrm{GHZ}} = \frac{1}{\sqrt{2}} \bigl( |+1\rangle_A |+1\rangle_B |+1\rangle_C - |-1\rangle_A |-1\rangle_B |-1\rangle_C \bigr) , \tag{1.7} $$

a so-called Greenberger-Horne-Zeilinger (GHZ) state [37]. Again $|\pm 1\rangle_A$,
$|\pm 1\rangle_B$ and $|\pm 1\rangle_C$ denote the eigenstates of the Pauli spin operators $\sigma_z^A$,
$\sigma_z^B$ and $\sigma_z^C$, with eigenvalues ±1. Similarly to Fig. 1.1, let us assume that
the polarization properties of this entangled quantum state are investigated
by three distant (space-like separated) observers A, B and C. Each of these
observers chooses his or her direction of polarization randomly along either
the x or the y axis.

What are the consequences an LRT would predict? As the three observers
are space-like separated, the locality assumption implies that a polarization
measurement by one of these observers cannot influence the results of the
other observers. Following the notation of Fig. 1.1, the possible results of the
polarization measurements of observers A, B and C along directions $\alpha_i$, $\beta_j$
and $\gamma_k$ are $a_i = \pm 1$, $b_j = \pm 1$ and $c_k = \pm 1$. Let us now consider four
possible coincidence measurements of these three distant observers, with results
$(a_x, b_x, c_x)$, $(a_x, b_y, c_y)$, $(a_y, b_x, c_y)$ and $(a_y, b_y, c_x)$. As we are dealing with
dichotomic variables, within an LRT the product of all these measurement
results is always given by

$$ R_{\mathrm{LRT}} = (a_x b_x c_x)(a_x b_y c_y)(a_y b_x c_y)(a_y b_y c_x) = a_x^2 b_x^2 c_x^2 a_y^2 b_y^2 c_y^2 = 1 . \tag{1.8} $$

What are the corresponding predictions of quantum theory? In quantum
theory the variables $a_i$, $b_j$ and $c_k$ are replaced by the Pauli spin operators
$\hat a_i = \alpha_i \cdot \sigma^A$, $\hat b_j = \beta_j \cdot \sigma^B$ and $\hat c_k = \gamma_k \cdot \sigma^C$. The GHZ state of (1.7) fulfills
the relations

$$ \hat a_x \hat b_x \hat c_x |\psi\rangle_{\mathrm{GHZ}} = -|\psi\rangle_{\mathrm{GHZ}} , \qquad \hat a_x \hat b_y \hat c_y |\psi\rangle_{\mathrm{GHZ}} = \hat a_y \hat b_x \hat c_y |\psi\rangle_{\mathrm{GHZ}} = \hat a_y \hat b_y \hat c_x |\psi\rangle_{\mathrm{GHZ}} = |\psi\rangle_{\mathrm{GHZ}} . \tag{1.9} $$

Therefore the quantum mechanical result for the product of (1.8) is given by

$$ R_{\mathrm{QM}} |\psi\rangle_{\mathrm{GHZ}} = (\hat a_x \hat b_x \hat c_x)(\hat a_x \hat b_y \hat c_y)(\hat a_y \hat b_x \hat c_y)(\hat a_y \hat b_y \hat c_x) |\psi\rangle_{\mathrm{GHZ}} = (-1)\, |\psi\rangle_{\mathrm{GHZ}} \tag{1.10} $$

and contradicts the corresponding result of an LRT. These peculiar quantum
mechanical predictions have recently been observed experimentally [38]. The
entanglement inherent in these states offers interesting perspectives on the
possibility of distributing quantum information between three parties [39].

^2 For a comprehensive discussion of experiments performed before 1989, see [29].
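The contradiction between (1.8) and (1.10) can be verified by direct calculation. The following sketch is an illustration only: it assumes the usual convention $\sigma_y|{\pm 1}\rangle = \pm i\,|{\mp 1}\rangle$ in the $\sigma_z$ eigenbasis, stores the GHZ state as a dictionary from spin triples to amplitudes, and uses hypothetical helper names.

```python
import itertools
import math

def lrt_products():
    """All values of the product in (1.8) for predetermined +/-1 results."""
    values = set()
    for ax, ay, bx, by, cx, cy in itertools.product((+1, -1), repeat=6):
        values.add((ax * bx * cx) * (ax * by * cy) * (ay * bx * cy) * (ay * by * cx))
    return values

def apply_pauli(state, axis, site):
    """Apply sigma_x or sigma_y to one spin of a state {(sA, sB, sC): amplitude}."""
    out = {}
    for spins, amp in state.items():
        s = spins[site]
        flipped = spins[:site] + (-s,) + spins[site + 1:]
        # sigma_x|s> = |-s>;  sigma_y|+1> = i|-1>,  sigma_y|-1> = -i|+1>
        factor = 1 if axis == "x" else 1j * s
        out[flipped] = out.get(flipped, 0) + factor * amp
    return out

def expectation(state, axes):
    """<state| P_A P_B P_C |state> for a string of axes such as 'xyy'."""
    new = state
    for site, axis in enumerate(axes):
        new = apply_pauli(new, axis, site)
    return sum(complex(state[k]).conjugate() * new.get(k, 0) for k in state)

amp = 1 / math.sqrt(2)
ghz = {(+1, +1, +1): amp, (-1, -1, -1): -amp}   # state (1.7)

print(lrt_products())
for axes in ("xxx", "xyy", "yxy", "yyx"):
    print(axes, expectation(ghz, axes))
```

Because the GHZ state is an eigenstate of each of the four operator products, the expectation values coincide with the eigenvalues in (1.9), and their product is -1 while every LRT assignment gives +1.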



1.1.2 Characteristic Quantum Effects for Practical Purposes 

According to a suggestion of Feynman [40], quantum systems are not only of
interest for their own sake but might also serve specific practical purposes. 
Simple quantum systems may be used, for example, for simulating other, more 
complicated quantum systems. This early suggestion of Feynman emphasizes 
possible practical applications and thus indicates already a change in the 
attitude towards characteristic quantum phenomena. 

In the same spirit, but independently, Wiesner suggested in the 1960s the 
use of nonorthogonal quantum states for the practical purpose of encoding 
secret classical information [41].^3 The security of such an encoding procedure
is based on a characteristic quantum phenomenon which does not involve 
entanglement, namely the impossibility of copying (or cloning) nonorthogonal 
quantum states [4]. This impossibility becomes apparent from the following 
elementary consideration. Let us imagine a quantum process which is capable 
of copying two nonorthogonal quantum states, say $|0\rangle$ and $|1\rangle$, with
$0 < |\langle 0|1\rangle| < 1$. This process is assumed to perform the transformation

$$ |0\rangle |\varphi\rangle |a\rangle \to |0\rangle |0\rangle |a_0\rangle , \qquad |1\rangle |\varphi\rangle |a\rangle \to |1\rangle |1\rangle |a_1\rangle , \tag{1.11} $$

where $|\varphi\rangle$ represents the initial quantum state of the (empty) copy and
$|a\rangle$, $|a_0\rangle$, $|a_1\rangle$ denote normalized quantum states of an ancilla system. This
ancilla system describes the internal states of the copying device. As this
copying process has to be unitary, it has to conserve the scalar product
between the two input and the two output states. This implies the relation
$\langle 0|1\rangle (1 - \langle 0|1\rangle \langle a_0|a_1\rangle) = 0$. This equality can be fulfilled only if either states
$|0\rangle$ and $|1\rangle$ are orthogonal, i.e. $\langle 0|1\rangle = 0$, or if $\langle 0|1\rangle = 1 = \langle a_0|a_1\rangle$. Both
possibilities contradict the original assumption of nonorthogonal, nonidentical
initial states. Therefore a quantum process capable of copying nonorthogonal
quantum states is impossible. This is an early example of an impossible
quantum process.

^3 Though this article was written in the 1960s, it was not published until 1983.
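The step from unitarity to the relation quoted above can be written out explicitly. The lines below are a short worked derivation that uses only the transformation (1.11) and the normalization of the copy and ancilla states; no symbols beyond those of the text are introduced.

```latex
\begin{align*}
\text{inputs:}\quad
  & \bigl(\langle 0|\langle\varphi|\langle a|\bigr)\bigl(|1\rangle|\varphi\rangle|a\rangle\bigr)
    = \langle 0|1\rangle\,\langle\varphi|\varphi\rangle\,\langle a|a\rangle
    = \langle 0|1\rangle , \\
\text{outputs:}\quad
  & \bigl(\langle 0|\langle 0|\langle a_0|\bigr)\bigl(|1\rangle|1\rangle|a_1\rangle\bigr)
    = \langle 0|1\rangle^{2}\,\langle a_0|a_1\rangle , \\
\text{unitarity:}\quad
  & \langle 0|1\rangle = \langle 0|1\rangle^{2}\,\langle a_0|a_1\rangle
    \;\Longleftrightarrow\;
    \langle 0|1\rangle\bigl(1 - \langle 0|1\rangle\langle a_0|a_1\rangle\bigr) = 0 .
\end{align*}
```

Since $|\langle 0|1\rangle| \le 1$ and $|\langle a_0|a_1\rangle| \le 1$ for normalized states, the second factor can vanish only if both overlaps have unit magnitude, which is excluded by $0 < |\langle 0|1\rangle| < 1$.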

Soon afterwards, Bennett and Brassard [42] proposed the first quan- 
tum protocol (BB84) for secure transmission of a random, secret key using 
nonorthogonal states of polarized photons for the encoding (see Table 1.1). 
In the Vernam cipher, such a secret key is used for encoding and decoding 
messages safely [6, 43]. In this latter encoding procedure the message and 
the secret key are added bit by bit, and in the decoding procedure they are 
subtracted again. If the random key is secret, the safety of this protocol is 
guaranteed provided the key is used only once, has the same length as the 
message and is truly random [44] . Nonorthogonal quantum states can help in 
transmitting such a random, secret key safely. For this purpose A(lice) sends 
photons to B(ob) which are polarized randomly either horizontally (+1) or
vertically (-1) along two directions of polarization. It is convenient to choose
the magnitude of the angle between these two directions of polarization to be
$\pi/8$. B(ob) also chooses his polarizers randomly to be polarized along these
directions. After A(lice) has sent all photons to B(ob), both communicate to 
each other their choices of directions of polarization over a public channel. 
However, the sent or measured polarizations of the photons are kept secret. 
Whenever they chose the same direction (yes), their measured polarizations 
are correlated perfectly and they keep the corresponding measured results 
for their secret key. The other measurement results (no) cannot be used for 
the key. Provided the transmission channel is ideal, A(lice) and B(ob) can 
use part of the key for detecting a possible eavesdropper because in this case 
some of the measurements are not correlated perfectly. In practice, however, 
the transmission channel is not perfect and A(lice) and B(ob) have to process 
their raw key further to extract from it a secret key [45].

Table 1.1. Part of a possible idealized protocol for transmitting a secret key,
according to [42]
[Table body not recoverable; surviving row labels: B(ob)’s direction i, B(ob)’s measured polarization, Secret key.]

It took some more years to realize that an exchange of secret keys can be achieved with the
help of entangled quantum states [46]. Thereby, the characteristic quantum
correlations of entangled states and the very fact that they are incompatible
with any LRT can be used for ensuring security of the key exchange.
After the first proof-of-principle experiments [47, 48], the first practical im- 
plementation of quantum cryptography over a distance of about 4 km was 
realized at the University of Geneva using single, polarized photons trans- 
mitted through an optical fiber [49] . These developments launched the whole 
new field of quantum cryptography. Now, this field represents the most devel- 
oped part of quantum information processing. Quantum cryptography based 
on the BB84 protocol has already been realized over a distance of 23 km 
[50]. Recent experiments [30, 31] have demonstrated that photon pairs can 
also be entangled over large distances, so that entanglement-based quantum 
cryptography over such large distances might become accessible soon. Some 
of these experiments are discussed in Chap. 3. 
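The sifting step of the BB84 protocol described above can be sketched in a few lines. This is a simplified simulation under explicit assumptions that go beyond the text: the channel is ideal, a measurement along the "wrong" direction is modelled as giving a completely random result, and the optional eavesdropper uses a plain intercept-and-resend strategy. The function names `bb84_sift` and `error_rate` are hypothetical.

```python
import random

def bb84_sift(n_photons, eavesdrop=False, seed=0):
    """Return the sifted keys of A(lice) and B(ob) as lists of +1/-1 values."""
    rng = random.Random(seed)
    key_a, key_b = [], []
    for _ in range(n_photons):
        direction_a = rng.randrange(2)        # A(lice)'s direction i
        bit_a = rng.choice((+1, -1))          # A(lice)'s polarization
        direction, bit = direction_a, bit_a   # photon in flight
        if eavesdrop:
            direction_e = rng.randrange(2)
            if direction_e != direction:      # wrong direction: random result
                bit = rng.choice((+1, -1))
            direction = direction_e           # photon is resent along this direction
        direction_b = rng.randrange(2)        # B(ob)'s direction i
        bit_b = bit if direction_b == direction else rng.choice((+1, -1))
        if direction_a == direction_b:        # public comparison: "yes"
            key_a.append(bit_a)
            key_b.append(bit_b)
    return key_a, key_b

def error_rate(key_a, key_b):
    """Fraction of sifted positions in which the two keys disagree."""
    return sum(a != b for a, b in zip(key_a, key_b)) / len(key_a)

key_a, key_b = bb84_sift(2000)
print(error_rate(key_a, key_b))                       # 0.0 without eavesdropper
print(error_rate(*bb84_sift(2000, eavesdrop=True)))   # noticeably above zero
```

In this idealized model the sifted keys agree perfectly unless someone has measured the photons in between, which is why comparing part of the key reveals an eavesdropper.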

Simultaneously with these developments in quantum cryptography, nu- 
merous other physical processes were discovered which were either enabled 
by entanglement or in which entanglement led to an improvement of perfor- 
mance. The most prominent examples are dense coding [51], entanglement- 
assisted teleportation [10, 11, 52] and entanglement swapping [52, 53]. (These 
processes are discussed in detail in Chaps. 2 and 3.) In the spirit of Feynman’s 
suggestion, all these developments demonstrate that characteristic quantum 
phenomena have practical applications in quantum information processing. 

1.1.3 Quantum Algorithms 

Feynman’s suggestion also indicates interesting links between quantum phy- 
sics and computer science. After the demonstration [54] that quantum sys- 
tems can simulate reversible Turing machines [55], the first quantum gener- 
alization of Turing machines was developed [7]. (Turing machines are general 
models of computing devices and will be discussed in detail in Chap. 4.) Fur- 
thermore, it was pointed out that one of the remarkable properties of such 
a quantum Turing machine is quantum parallelism, by which certain tasks 
may be performed faster than by any classical computing device. Deutsch’s 
algorithm [7, 56] was the first quantum algorithm demonstrating how the 
interplay between quantum interference, entanglement and the quantum me- 
chanical measurement process could serve this practical purpose. 

The computational problem solved by Deutsch’s algorithm is the following.
We are given a device, a so-called oracle, which computes a Boolean
function $f$ mapping all possible binary n-bit strings onto one single bit.
Therefore, given a binary n-bit string $x$ as input, this oracle can compute
$f(x) \in \{0, 1\}$ in a single step. Furthermore, let us assume that this function
is either constant or balanced. Thus, in the first case the $2^n$ possible input
values of $x$ are all mapped onto 0 or all onto 1. In the second case half of
the input values are mapped onto either 0 or 1 and the remaining half are
mapped onto the other value. The problem is to develop an algorithm which
determines whether $f$ is constant or balanced.
Let us first of all discuss briefly the classical complexity of this problem.
In order to answer the question in the worst possible case, the oracle has to be
queried more than $2^{n-1}$ times. It can happen, for example, that the first $2^{n-1}$
queries all give the same result, so that at least one more query of the oracle
is required to decide whether $f$ is constant or balanced. Thus, classically, it
is apparent that the number of steps required grows exponentially with the
number of bits.
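The classical worst case can be illustrated with a small deterministic procedure. The sketch below is an assumption-laden illustration, not an algorithm from the text: the oracle is modelled as an ordinary Python function on bit tuples that is promised to be constant or balanced, and `classify_classically` is a hypothetical name.

```python
import itertools

def classify_classically(f, n):
    """Decide 'constant' or 'balanced' for a promise function f on n-bit tuples.

    Returns the answer together with the number of oracle queries used.
    """
    first = None
    queries = 0
    for x in itertools.product((0, 1), repeat=n):
        value = f(x)
        queries += 1
        if first is None:
            first = value
        elif value != first:
            return "balanced", queries      # two different values seen
        if queries == 2 ** (n - 1) + 1:
            return "constant", queries      # more than half agree: cannot be balanced
    return "constant", queries

# Worst case for a balanced function: the first half of the inputs all give 0.
print(classify_classically(lambda x: x[0], 3))   # ('balanced', 5)
print(classify_classically(lambda x: 1, 3))      # ('constant', 5)
```

For n = 3 both calls need $2^{n-1} + 1 = 5$ queries, in line with the exponential growth stated above.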



Fig. 1.2. Basic operation of a quantum oracle $U_f$ which evaluates a Boolean
function $f: x \in \mathbf{Z}_2^n \to f(x) \in \mathbf{Z}_2^1 \equiv \{0, 1\}$; $|x\rangle$ is the input state of an n-qubit quantum
system; $|a\rangle$ is a one-qubit state and $\oplus$ denotes addition modulo 2



Quantum mechanically, the situation is different. The $2^n$ possible binary
n-bit strings $x$ can be represented by quantum states $|x\rangle$, which form a
basis in a $2^n$-dimensional Hilbert space, which is the state space of n
qubits. Furthermore, we imagine that the classical oracle is replaced by a
corresponding quantum oracle (Fig. 1.2). This is a unitary transformation
$U_f$ which maps basis states of the form $|x\rangle|a\rangle$, where $a \in \{0, 1\}$, to output
states of the form $|x\rangle|a \oplus f(x)\rangle$ in a single step. Here, $|a\rangle$ denotes the
quantum state of an ancilla qubit and $\oplus$ denotes addition modulo 2. If the initial
state is $|x\rangle|0\rangle$, for example, the quantum oracle performs an evaluation of
$f(x)$, resulting in the final state $|x\rangle|f(x)\rangle$. However, as this transformation
is unitary, it can perform this task also for any linear combination of
possible basis states in a single step. This is the key idea of quantum parallelism
[7]. Deutsch’s quantum algorithm obtains the solution to the problem posed
above by the following steps (Fig. 1.3):



1. The n-qubit quantum system and the ancilla system are prepared in
states $|0\rangle$ and $(|0\rangle - |1\rangle)/\sqrt{2}$. Then a Hadamard transformation

$$ H: \quad |0\rangle \to \frac{1}{\sqrt{2}} (|0\rangle + |1\rangle) , \qquad |1\rangle \to \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle) \tag{1.12} $$

is applied to all of the first n qubits. We denote by $H^{(i)}$ the application
of H to the ith qubit. Thus, the separable quantum state

$$ |\psi_1\rangle \equiv \Bigl[ \prod_{i=1}^{n} H^{(i)} |0\rangle \Bigr] \frac{1}{\sqrt{2}} (|0\rangle - |1\rangle) = \frac{1}{\sqrt{2^{n+1}}} \sum_{x \in 2^n} |x\rangle (|0\rangle - |1\rangle) \tag{1.13} $$

is prepared.

2. A single application of the quantum oracle $U_f$ to state $|\psi_1\rangle$ yields the
quantum state

$$ |\psi_2\rangle \equiv U_f |\psi_1\rangle = \frac{1}{\sqrt{2^{n+1}}} \sum_{x \in 2^n} (-1)^{f(x)} |x\rangle (|0\rangle - |1\rangle) . \tag{1.14} $$

3. Subsequently a quantum measurement is performed to determine whether
the system is in state $|\psi_1\rangle$ or not. With the help of n Hadamard
transformations (as in step 1), this quantum measurement can be reduced to
a measurement of whether the first n qubits of the quantum system are
in state $|0\rangle$ or not.



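If in step 3 the quantum system is found in state $|\psi_1\rangle$, $f$ is constant, otherwise
$f$ is balanced. One of these two possibilities is observed with unit probability.
The probability $p$ of observing the quantum system in state $|\psi_1\rangle$ is given by

$$ p \equiv |\langle \psi_1 | \psi_2 \rangle|^2 = \frac{1}{2^{2n}} \Bigl| \sum_{x \in 2^n} (-1)^{f(x)} \Bigr|^2 . \tag{1.15} $$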

Taking into account the single application of the quantum oracle in step 2 
and the application of the Hadamard transformations in the preparation and 
measurement processes, Deutsch’s quantum algorithm requires $O(n)$ steps to
obtain the final answer, in contrast to any classical algorithm, which needs 
an exponential number of steps. Thus Deutsch’s quantum algorithm leads to 
an exponential speedup. 
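The three steps of the algorithm can be followed on a small state-vector simulation. The sketch below is illustrative only and makes several assumptions not fixed by the text: the state of the n + 1 qubits is stored as a plain list of amplitudes, the ancilla is taken to be the least significant bit, the oracle function `f` maps an integer x to 0 or 1, and the measurement of step 3 is implemented by applying the Hadamard transformations again and reading off the weight on $|0 \ldots 0\rangle$ of the first n qubits. All function names are hypothetical.

```python
import math

def hadamard(state, qubit, n_qubits):
    """Apply H of (1.12) to one qubit; qubit 0 is the most significant bit."""
    out = [0.0] * len(state)
    mask = 1 << (n_qubits - 1 - qubit)
    for index, amp in enumerate(state):
        if amp == 0:
            continue
        sign = -1.0 if index & mask else 1.0
        out[index & ~mask] += amp / math.sqrt(2)          # component with this qubit in |0>
        out[index | mask] += sign * amp / math.sqrt(2)    # component with this qubit in |1>
    return out

def oracle(state, f):
    """U_f: |x>|a> -> |x>|a XOR f(x)>, a permutation of the basis states."""
    out = [0.0] * len(state)
    for index, amp in enumerate(state):
        x, a = index >> 1, index & 1
        out[(x << 1) | (a ^ f(x))] += amp
    return out

def probability_constant(f, n):
    """Probability p of (1.15), evaluated by simulating steps 1 to 3."""
    total = n + 1
    state = [0.0] * (2 ** total)
    state[1] = 1.0                           # |0...0>|1>
    state = hadamard(state, n, total)        # ancilla -> (|0> - |1>)/sqrt(2)
    for q in range(n):                       # step 1
        state = hadamard(state, q, total)
    state = oracle(state, f)                 # step 2: one oracle call
    for q in range(n):                       # step 3: rotate back before measuring
        state = hadamard(state, q, total)
    return state[0] ** 2 + state[1] ** 2     # first n qubits in |0...0>

print(probability_constant(lambda x: 1, 3))        # close to 1 (constant)
print(probability_constant(lambda x: x & 1, 3))    # close to 0 (balanced)
```

The simulation itself costs exponential classical resources; it only serves to check that a single oracle call followed by Hadamard transformations separates the two cases with certainty.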

A key element of this quantum algorithm and of those discovered later is 
the quantum parallelism involved in step 2, where the linear superposition 
of the first n qubits comprises the requested global information about the
function $f$. For most of the possible functions $f$ this intermediate quantum
state is expected to be entangled. An exception is the case of a constant
function $f$, for which the quantum state $|\psi_2\rangle$ is separable. Furthermore, it is also
crucial for the success of this quantum algorithm that the final measurement 
in step 3 yielding the required answer can be implemented by a fast quantum 
measurement whose complexity is polynomial in n. This is a requirement 
fulfilled by all other known fast quantum algorithms. The quantum algo- 
rithm described above was the first example demonstrating that quantum 
phenomena may speed up computations in such a way that an exponential 
gap appears between the complexity class of the quantum problem and the 
complexity class of the corresponding classical probabilistic problem. 

Continuing this development initiated by Deutsch, other, new fast quan- 
tum algorithms were discovered in the subsequent years. The most prominent 
examples are Simon’s quantum algorithm [57], Shor’s celebrated algorithm 
[9] for factorizing numbers, and Grover’s search algorithm [58]. (Quantum
algorithms are discussed in detail in Chap. 4.) In addition, possible realizations
of quantum computing devices were suggested which were based on trapped 
ions [59] and on cavity quantum electrodynamical setups [60]. These devel- 
opments called for new methods for stabilizing quantum algorithms against 
perturbing environmental influences, which tend to destroy quantum inter- 
ference and quantum entanglement [61]. This led to the development of the 
first error-correcting codes [62, 63, 64, 65, 66] by adaptation of classical error- 
correcting techniques to the quantum domain. An introduction to the theory 
of quantum error correction is presented in Chap. 4. 



1.2 Quantum Physics and Information Processing 

What are the common features of these early developments? The common 
element of these early developments in quantum cryptography and quan- 
tum computation is that they all involve the practical processing of informa- 
tion and they are all founded on and facilitated by characteristic quantum 
phenomena. These phenomena, among which the most prominent is entan- 
glement, are in conflict with the classical concepts of physical reality and 
locality. Obviously, these early developments hint at a profound connection 
between the concept of information and some fundamental concepts of quan- 
tum theory, which is also promising from the technological point of view. 
It is these technologically oriented aspects of quantum information theory 
[67, 68, 69] which are at the heart of quantum information processing. 

Methods for processing quantum information have developed rapidly dur- 
ing the last few years [12]. Owing to significant experimental advances, ba- 
sic interference and entanglement phenomena which are of central interest 
for processing quantum information have been realized in the laboratory in 
various physical systems. Basic schemes for quantum communication have 
been demonstrated with photons [10, 11, 49, 70]. Realizations of elementary
quantum logical operations have been based on trapped ions [13, 14] and on 
nuclear magnetic resonance [15]. Recent experiments indicate that besides 
cavity quantum electrodynamical setups [16], trapped neutral atoms which 
are guided along magnetic wires (atom chips) might also be useful for quan- 
tum information processing [17]. There have also been theoretical proposals 
on using ultracold atoms in optical lattices [18, 19], on ions in an array of 
microtraps [20] and on solid-state devices [21, 22, 23] for the implementation 
of quantum logical gates. 

By now, quantum information processing has become an interdisciplinary 
subject which attracts not only physicists but also researchers from other 
communities. The common interest is the practical, technologically oriented 
application of characteristic quantum phenomena. At this stage of develop- 
ment, it appears necessary to examine recent achievements and to emphasize 
the underlying, general, basic concepts, which have been developing gradu- 
ally and which are now commonly adopted by all researchers in this field. 
This is one of the main intentions of the rest of the book. 

In Chap. 2, Werner introduces the basic concepts of quantum information 
theory and describes the fundamental mathematical structures underlying re- 
cent and current developments. In particular, this chapter addresses a natural 
question appearing in connection with Feynman’s suggestion, namely what 
can be done with the help of quantum systems and what cannot be done. A 
first example of an impossible quantum process, the copying of nonorthogonal 
quantum states, has already been mentioned. Other examples of possible and 
impossible quantum processes are discussed in detail in this contribution. 

First experimental realizations of basic quantum communication schemes 
based on entangled photon pairs are discussed in Chap. 3 by Weinfurter and 
Zeilinger. These first experiments on entanglement-based quantum cryptog- 
raphy, dense coding and quantum teleportation demonstrate the important 
role photons play in current experiments. Furthermore, these experiments 
also emphasize once again the fundamental significance of entanglement for 
quantum information processing. 

The basic theoretical concepts of quantum computation and the mathe- 
matical structure underlying quantum algorithms are discussed in Chap. 4 
by Beth and Rötteler. In particular, it is demonstrated how recent results in
the theory of signal processing can be used for the development of new fast 
quantum algorithms. A short introduction to the theory of quantum error 
correction is also presented. 

A comprehensive account of the mathematical structure of entanglement 
and of the significance of mixed entangled states for quantum information 
processing is presented in Chap. 5 by M. Horodecki, P. Horodecki and R. 
Horodecki. One of the most surprising recent developments in this context 
has been the discovery of bound entanglement [71]. Though much is still 
unknown, this section gives a state-of-the-art presentation of what is known
about this new form of entanglement and its implications for processing
quantum information.



2 Quantum Information Theory

2.1 Introduction

Quantum information and quantum computers have received a lot of public 
attention recently. Quantum computers have been advertised as a kind of 
warp drive for computing, and indeed the promise of the algorithms of Shor 
and Grover is to perform computations which are extremely hard or even 
provably impossible on any merely “classical” computer. On the experimental 
side, perhaps the most remarkable feat of quantum information processing 
was the realization of “quantum teleportation”, which once again has science
fiction overtones. 

In some sense these miracles are an extension of the strangeness of quan- 
tum mechanics - those unresolved questions in the foundations of quantum 
mechanics, which most physicists know about, but few try to tackle directly 
in their research. However, trying to build an explanation of quantum in- 
formation on the literature about the foundations of quantum mechanics is 
more likely to mystify than to clarify. It would also give a wrong idea of 
how discussions in this new field are conducted. Because, just as physicists 
with widely differing convictions on foundational matters can usually agree 
quite easily on what the predictions of quantum mechanics are in a partic- 
ular experimental setup, researchers in quantum information can agree on 
whether a device should work, no matter what they may think about the 
deeper meaning of the wave function. For example, one of the founders of the 
field is an outspoken proponent of the many-worlds interpretation of quan- 
tum mechanics (which I, personally, find useless and bizarre). But, whatever 
the intuitions leading him to his discoveries about quantum computing may 
have been, these discoveries make sense in every other interpretation. 

In this article I shall give an account of the basic concepts of quantum 
information theory, staying as much as possible in the area of general agree- 
ment. So, in order to enter this new field, plain quantum mechanics is enough, 
and no new, perhaps obscure, views are needed. There is, of course, a
characteristic shift in emphasis expressed by the word “information”, and we shall
have to explore the consequences of this shift. 

The article is divided into two parts. The first (up to the end of Sect. 2.5) 
is mostly in plain English, centered around the exploration of what can or 
cannot be done with quantum systems as information carriers. The second
part, Sect. 2.6, then gives a description of the mathematical structures and
of some of the tools needed to develop the theory.

G. Alber, T. Beth, M. Horodecki, P. Horodecki, R. Horodecki, M. Rötteler, H. Weinfurter,

2.2 What is Quantum Information? 

Let us start with a preliminary definition: 

Quantum information is that kind of information which is carried 

by quantum systems from the preparation device to the measuring 

apparatus in a quantum mechanical experiment. 

So a “transmitter” of quantum information is nothing but a device preparing 
quantum particles, and a “receiver” is just a measuring device. Of course, this 
is not saying much. But even so, it is a strange statement from the point of 
view of classical information theory: in that theory one usually does not care 
about the physical carrier of the information, or else one would have to dis- 
tinguish “electrodynamical information”, “printed information”, “magnetic 
information” and many more. In fact, the success of (classical) information 
theory depends largely on abstracting from the physical carrier, and going 
instead for the general principles underlying any information exchange. So 
why should “quantum information” be any different? 

A moment’s reflection makes clear why the abstraction from the physical 
carrier of information leads to a successful theory: the reason is that it is so 
easy to convert information between all such carriers. The conversion from 
bytes on a hard disk, to currents in a chip, to signals on a cable, to radio 
waves via satellite and maybe, finally, to an image on a computer screen in 
another continent all happens essentially without loss, and if there are losses, 
they are well understood, and it is known how to correct for them. Therefore, 
the crucial question is: can “quantum information” in the above loose sense 
also be converted to those standard classical kinds of information, and back, 
without loss? Or: are there fundamental limitations to such a translation, 
and is quantum information hence really a new kind of information? 

This book would not have been written if the answer to the last question 
were not affirmative: quantum information is indeed a new kind of informa- 
tion. But to make this precise, let us see what would be required of a suc- 
cessful translation. Let us begin with the conversion of quantum information 
to classical information: a device for this conversion would take a quantum 
system and produce as its output some classical information. This is nothing 
but a complicated way of saying “measurement”. The reverse translation,
from classical to quantum information, obviously involves some preparation 
of quantum systems. The classical input to such a device is used to control 
the settings of the preparing device, and any dependence of the preparation 
process on classical information is admissible. There are two kinds of de- 
vices we can combine from these two elements. Let us first consider a device 
going from classical to quantum to classical information. This is a rather
commonplace operation. For example, one can encode one classical bit in
the polarization degree of freedom of a photon (clearly a quantum system),
by choosing one of two orthogonal polarizations for the photon, depending
on the value of the classical bit. The readout is done by a photomultiplier
combined with a polarization filter in one of the corresponding directions. In
principle, this allows a perfect transmission. In some sense every
transmission of classical information is of this kind, because every physical system
ultimately obeys the laws of quantum mechanics, even if we can often
disregard this fact and treat it classically. Hence classical information can be
translated into quantum information (and back).

Fig. 2.1. Classical teleportation. Here and in the following diagrams, a wavy arrow
stands for quantum systems, and a straight arrow for the flow of classical
information
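The photon example can be made concrete with a small numerical sketch. The following is only an illustration under idealized assumptions (lossless channel, perfect detector); the names `encode`, `decode` and `outcome_probabilities` are hypothetical and not taken from the text.

```python
import numpy as np

# Two orthogonal polarization states (horizontal and vertical) as unit vectors.
H = np.array([1.0, 0.0])
V = np.array([0.0, 1.0])

def encode(bit):
    """Prepare a photon polarization state depending on the classical bit."""
    return H if bit == 0 else V

def outcome_probabilities(state):
    """Born-rule probabilities for an ideal polarization measurement in the H/V basis."""
    return np.array([abs(np.vdot(H, state)) ** 2, abs(np.vdot(V, state)) ** 2])

def decode(state):
    """Read out the bit as the most likely outcome (for H or V it is certain)."""
    return int(np.argmax(outcome_probabilities(state)))

for bit in (0, 1):
    print(bit, outcome_probabilities(encode(bit)), decode(encode(bit)))
```

Because the two polarizations are orthogonal, each outcome occurs with probability one for its own input, which is the sense in which the classical bit is transmitted perfectly.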

But what about the converse? This hypothetical (and in fact, impossi- 
ble) process has come to be known as classical teleportation (see Fig. 2.1). It 
would involve a measuring device M, operating on some input quantum sys- 
tems. The results of the measurements are subsequently fed into a preparing 
device P, which produces the final output of the combined device. The task 
is to set things up such that the outputs of the combined device are indistin- 
guishable from the quantum inputs. Of course, we have to say precisely what 
“indistinguishable” should mean. Clearly, this cannot mean that “the same” 
system comes out at the other end. In the classical case this is not demanded 
either. All that can be meant in quantum mechanics is that no statistical
test will see the difference. In other words, no matter what the preparation 
of the input systems is and no matter what observable we measure on the 
outputs of the teleportation device, we shall always get the same probability 
distribution of results as if the inputs had been directly measured. Note also 
that this criterion does not involve the states of individual systems, but only 
states in the form of the distribution parameters of ensembles of identically 
prepared systems. 
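To see how the statistical criterion can fail, one may try a single naive measure-and-prepare strategy on a qubit. The sketch below assumes a measurement in a fixed basis followed by re-preparation of the state matching the result; it illustrates one strategy only and is not a proof of the general impossibility. All function names are hypothetical.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)

def projector(ket):
    """Density matrix |ket><ket| of a pure state."""
    return np.outer(ket, ket.conj())

def measure_and_prepare(rho):
    """Measure in the {|0>, |1>} basis, then re-prepare the state matching the result."""
    p0 = np.real(np.trace(projector(ket0) @ rho))
    p1 = np.real(np.trace(projector(ket1) @ rho))
    return p0 * projector(ket0) + p1 * projector(ket1)

def prob_plus(rho):
    """Probability of the outcome '+' when measuring in the {|+>, |->} basis."""
    return float(np.real(np.trace(projector(plus) @ rho)))

rho_in = projector(plus)               # input ensemble prepared in |+>
rho_out = measure_and_prepare(rho_in)  # output of the naive device
print(prob_plus(rho_in), prob_plus(rho_out))
```

For this input the test observable gives the outcome "+" with certainty on the inputs but only half of the time on the outputs, so a statistical test distinguishes the two and this particular device fails the criterion.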

The impossibility of classical teleportation will be treated extensively in 
the following section, where it is related to a hierarchy of impossible machines. 
For a mathematical statement of this impossibility in the standard
formalism of quantum mechanics, see the remark after (2.7). For the moment,
however, let us take it for granted, and see what all this says about the new 
concept of quantum information. 

First of all, we are concerned here with problems of transmission, not with 
content or meaning. This is exactly the same as in classical information
theory. There, too, it is often not easy to avoid confusion with a different concept
of “information” used in everyday language, namely the kind available at an
information desk. Information theory does not care whether a TV channel 
is used for “misinformation”, but can say everything about what it takes 
to ensure the technical quality of the final images. Hence the quantitative 
measures of “information” all relate to storage and transmission capacity, to 
the possibilities of compression and error correction and so on. In the same 
vein, quantum information theory will not tell us what the meaning of a 
“quantum message” is, and this is probably meaningless anyway, because a 
message that has been “read” is classical almost by definition. But quantum 
information theory has precise notions of the resources needed to transmit 
such information faithfully. 

Secondly, transmission of quantum information is not at all an exotic 
concept in the context of modern physics. It can be paraphrased in various, 
perhaps more familiar ways, for example as “transmission of intact quantum 
states”, as “coherent transmission of quantum systems” or as transmission 
“preserving all interference possibilities” of the system. Nevertheless the in- 
formation metaphor is useful, not only because it suggests new applications, 
but also because it leads one to ask new questions, and leads to quantitative 
notions where previously there was only a qualitative understanding. And 
possibly this even provides a way to see in a sharper light the old conun- 
drums of the foundations of quantum mechanics. 



2.3 Impossible Machines 

The usefulness of considering impossible machines is well known from
thermodynamics: the second law of thermodynamics is often stated as the
impossibility of a perpetual-motion machine. The theorem of the impossibility of
classical teleportation is likewise a fundamental law of quantum mechanics, 
and a lot can be learned from analyzing it. Typically, the impossible ma- 
chines of quantum theory are perfectly possible in classical physics, so their 
impossibility does not follow superficially from their description, but rather 
carries a connotation of paradox. 

We shall discuss a range of impossible tasks, consisting of 

• teleportation 

• copying (“cloning”) 

• joint measurement 

• Bell’s telephone. 

As we shall see, teleportation is the most powerful of these, in the sense that 
if we had a teleportation device, we could build a quantum copier, from which 
we could in turn construct joint measurements and, finally, a device known 
as Bell’s Telephone, by which we could set up superluminal communication. 
Hence, if we uphold the principle of causality, which forbids the weakest
machine in this hierarchy, we are certain that teleportation is likewise
impossible. In this section we shall follow this line of reasoning to prove the
impossibility of teleportation. Of course, there are other, more direct ways of 
proving this result from the structure of quantum mechanics. However, these 
usually require more of the quantum formalism and give less insight into the 
differences between classical and quantum information. 

2.3.1 The Quantum Copier 

This is the machine referred to in the well-known paper of Wootters and 
Zurek entitled “A single quantum cannot be cloned” [!]. By definition, a 
copier would be a device taking one quantum system as input and turning out 
two systems of the same type. The condition for calling this a (faithful) copier 
is that we would not be able to distinguish a system coming from the output 
from the input system by any statistical test, i.e. by means of the probabilities 
measured for any observable, and for any preparation of the initial state. 
Hence the device has to operate on arbitrary “unknown” states. It is clear 
that a copier in the ordinary sense, e.g. a mail relay distributing email to 
several recipients, indeed satisfies this condition in the domain of classical 
information. Note that we are not so unreasonable as to demand what the 
paper quoted above suggests, namely that we could test this device on single 
events, or even assume some ontological “identity” of input and output: the 
criterion for faithful copying is flatly statistical, and can be verified by a 
straightforward collection of statistical tests. 

Given a teleportation device, building a copier is quite easy (see Fig. 2.2). 
All we have to do is to remember that the classical information obtained in 
the intermediate stage of the teleportation process can be copied perfectly. 
Hence we can apply the measuring device of the teleportation line to the 
input system, copy the results, and simply run the reconstructing preparation 
process on each of these copies.

Fig. 2.2. Making a copier from a “classical teleportation” line

2.3.2 Joint Measurement 

This is the task of combining two separate measuring devices into a single 
device, or the “simultaneous measurement” of two quantum observables A 
and B. Thus, a joint measuring device is a device giving a pair (a, b)
of classical outputs each time it is operated, such that a is a possible output
of A, and b is a possible output of B. (We use the symbol A to denote both an
observable and a device that measures this observable, and similarly for B.) We
require that the statistics of the a outcomes alone are the same as for device 
A, and similarly for B. Note that once again our criterion is statistical, and 
can be tested without recourse to counterfactual conditionals such as “the
result which would have resulted if B rather than A had been measured on
this particular quantum particle”.

Many quantum observables are not jointly measurable in this sense. The 
most famous examples, position and momentum, different components of 
angular momentum, and positions of a free particle at different times, are 
probably contained in every quantum mechanics course. Hence the
impossibility of joint measurements is nothing but a precise statement of an aspect
of “complementarity”.

Nevertheless, a joint measurement device for any of these could readily 
be constructed given a functioning quantum copier (see Fig. 2.3): one would 
simply run the copier C on the quantum system, and then apply the two given 
measuring devices, A and B, to the copies. It is easy to see that the definition 
of the copier then guarantees that the statistics of a and b separately come 
out right. In other words, a copier can be seen as a universal joint measuring 
device.

Fig. 2.3. Obtaining joint measurements from a copier



2.3.3 Bell’s Telephone 

This is not named after a certain phone company, but after John S. Bell, 
who never proposed it in this form, but might have. It refers to a project of
performing superluminal communication using only correlations of the type
tested by Bell’s inequalities. Without going into details for the moment, the 
basic setup would consist of a source producing pairs of particles and sending 
one member of each pair to each of the two communicating parties,
conventionally named “Alice” and “Bob”. Each of them has a collection of different
measuring devices to choose from, and the idea is for Alice to do some- 
thing which creates a noticeable change in the probabilities measured by 
Bob. Clearly, this is a paradoxical task, because no particle or other physi- 
cal carrier of information actually goes from Alice to Bob. Therefore, if the 
particles move sufficiently far apart from one another, this device transmits 
superluminally. 

It may be useful to point out here a common confusion concerning such
superluminal effects, which sometimes even afflicts otherwise reliable
professional writers. The mistake can usually be spotted easily by a device I call
the “ping-pong ball test”. It goes like this:

Take an author’s explanation of Bell’s inequalities, and substitute 
“ping-pong balls” for every quantum particle. Then if whatever the 
author is selling as paradoxical remains true, he/she hasn’t under- 
stood a thing. 

Here is an example: imagine a box containing a ping-pong ball; the box can 
be separated into two parts, without anyone looking at the ball. One part is 
shipped to Tokyo or Alpha Centauri, without anyone looking inside. Then if 
I open the other box I know instantly, i.e. “at superluminal speed”, whether
the ball is at the distant location or not. Of course, this is true but hardly 
paradoxical, and is totally useless for sending a message either way. To repeat: 
there is nothing paradoxical in statistical correlations per se between distant 
systems with a common past, even if the correlation is perfect. 

If Alice wants to send a message to Bob, correlations between two mea- 
suring devices are useless, because they cannot even be detected without 
comparing the results, which requires exactly the communication the Tele- 
phone was intended for. Only if something Alice does has an effect on the 
measurement results at Bob’s end can we speak of communication. The only 
thing Alice can do in the standard setup is to choose a measuring device, and 
Bell’s Telephone can be said to work if these choices have an influence on the 
probabilities measured by Bob (who has no access to Alice’s measurement 
results). If there is no physical system traveling from Alice to Bob, however, 
this will be impossible. 

To be fair, this can hardly be counted as an impossible machine of quan- 
tum mechanics, since the argument has nothing to do with quantum theory. 
What makes it fit into the hierarchy described here is the following: if we 
assume that Bob has a joint measuring device for two yes/no measurements, 
and Bell’s inequalities are violated, we can design a strategy for Alice to send 
signals to Bob with better than chance results. Hence the joint measurement 
of suitable observables can provide a device sufficiently strong to achieve a
task forbidden by causality, and hence is impossible in general. This is the
link between the last two elements in the hierarchy of impossible machines
mentioned at the beginning of Sect. 2.3.

Fig. 2.4. Building Bell’s Telephone from a joint measurement

The proof of this step amounts to yet another derivation of Bell’s inequal- 
ities, but since it emphasizes the communication aspect it fits well into our 
context, and we shall at least sketch it. This step will be rather more technical 
than the rest of this section, but does not require any quantum theory. The 
argument can be skipped without loss as far as later sections are concerned. 

So let us assume that Alice and Bob each have at their disposal two
measuring devices, say $A_1, A_2$ and $B_1, B_2$, respectively. Each of these can give
a result of either $+1$ or $-1$. We shall denote by $P(a, b \mid A_i, B_j)$ the probability
for Alice to obtain $a$ and Bob to obtain $b$ in a correlation experiment in which
Alice uses measuring device $A_i$ and Bob uses $B_j$. By

\[ C(A_i, B_j) = \sum_{a,b} a\, b\, P(a, b \mid A_i, B_j) \]

we shall denote the correlation coefficient, which lies between $-1$ and $+1$.
The combination

\[ \beta = C(A_1, B_1) + C(A_1, B_2) + C(A_2, B_1) - C(A_2, B_2) \tag{2.1} \]

carries special significance, as we shall see below. Because the inequality $\beta \le 2$
is known as the Bell inequality (see Sect. 1.1.1), we shall call $\beta$ the Bell
correlation for this choice of four observables. It is a quantity directly accessible to
experiment. Note that Bob usually cannot tell from his data which apparatus
($A_1$ or $A_2$) Alice chose. This is reflected by the equation

\[ \sum_{a} P(a, b \mid A_1, B_j) = \sum_{a} P(a, b \mid A_2, B_j) = P(b \mid B_j) , \]

and is borne out by all known experimental data. Now suppose Bob has a
joint measuring device for his $B_1$ and $B_2$, which we shall denote by $B_1 \& B_2$,
which produces pair outcomes $(b_1, b_2)$ (see Fig. 2.4). We can then determine
the probabilities $P_i(a_i, b_1, b_2) = P(a_i, (b_1, b_2) \mid A_i, B_1 \& B_2)$. The condition
that this is really a joint measurement is expressed by the equations

\[ \sum_{b_1} P_i(a_i, b_1, b_2) = P(a_i, b_2 \mid A_i, B_2) \quad \text{and} \tag{2.2} \]

\[ \sum_{b_2} P_i(a_i, b_1, b_2) = P(a_i, b_1 \mid A_i, B_1) , \tag{2.3} \]

each for $i = 1, 2$. The basic rule for the information transmission is the
following:

Alice encodes the bit she wants to send by choosing either apparatus
$A_1$ or apparatus $A_2$. Then Bob looks at his readout and interprets it
as “$A_1$” whenever the two displays coincide ($b_1 = b_2$), and as “$A_2$”
if they are different.

We can then estimate the probability $P_{\mathrm{ok}}$ for Bob to be right, assuming
that the choices $A_1$ and $A_2$ are made with the same frequency. Assume first
that Alice chooses $A_1$. Then Bob is right with probability

\[ \sum_{a_1, b_1, b_2} \left| \frac{b_1 + b_2}{2} \right| \, |a_1| \, P_1(a_1, b_1, b_2) , \]

where the first factor takes into account the condition $b_1 = b_2$, and the second
is introduced for later convenience. Combining this with a second term of
similar kind for Alice’s choice $A_2$, and taking into account the probability
1/2 for each of these choices, we obtain the overall probability $P_{\mathrm{ok}}$ for Bob to
be correct as

\[
\begin{aligned}
P_{\mathrm{ok}} &= \frac{1}{2} \sum_{a_1, b_1, b_2} \left| \frac{b_1 + b_2}{2} \right| \, |a_1| \, P_1(a_1, b_1, b_2)
 + \frac{1}{2} \sum_{a_2, b_1, b_2} \left| \frac{b_1 - b_2}{2} \right| \, |a_2| \, P_2(a_2, b_1, b_2) \\
&\ge \frac{1}{4} \sum_{a_1, b_1, b_2} (b_1 + b_2)\, a_1\, P_1(a_1, b_1, b_2)
 + \frac{1}{4} \sum_{a_2, b_1, b_2} (b_1 - b_2)\, a_2\, P_2(a_2, b_1, b_2) \\
&= \frac{1}{4} \bigl( C(A_1, B_1) + C(A_1, B_2) + C(A_2, B_1) - C(A_2, B_2) \bigr) \\
&= \frac{\beta}{4} .
\end{aligned}
\tag{2.4}
\]
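The estimate (2.4) can be checked numerically. The sketch below draws random probability tables for the outcome triples of a hypothetical joint device, computes Bob's success probability for the decoding rule stated above, and compares it with one quarter of the Bell correlation obtained from the marginals as in (2.2) and (2.3). The tables are arbitrary test data, not physical predictions, and no no-signalling condition is imposed since the inequality does not need it.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
outcomes = list(itertools.product([+1, -1], repeat=3))  # triples (a, b1, b2)

def random_distribution():
    """A random probability distribution P_i(a, b1, b2) over the eight triples."""
    weights = rng.random(len(outcomes))
    return dict(zip(outcomes, weights / weights.sum()))

def check_once():
    """Return (P_ok, beta) for one random pair of tables P_1, P_2."""
    P1, P2 = random_distribution(), random_distribution()
    # Bob guesses "A1" when b1 == b2 and "A2" otherwise; both choices are equally likely.
    p_ok = (0.5 * sum(p for (a, b1, b2), p in P1.items() if b1 == b2)
            + 0.5 * sum(p for (a, b1, b2), p in P2.items() if b1 != b2))
    # Correlation coefficients computed from the marginals of the joint tables.
    C11 = sum(a * b1 * p for (a, b1, b2), p in P1.items())
    C12 = sum(a * b2 * p for (a, b1, b2), p in P1.items())
    C21 = sum(a * b1 * p for (a, b1, b2), p in P2.items())
    C22 = sum(a * b2 * p for (a, b1, b2), p in P2.items())
    beta = C11 + C12 + C21 - C22
    return p_ok, beta

results = [check_once() for _ in range(1000)]
print(all(p_ok >= beta / 4 - 1e-12 for p_ok, beta in results))
```

The inequality holds term by term because the product of two absolute values is never smaller than the product of the numbers themselves, which is the only step in (2.4) that is not an equality.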



Bob is right with a better probability than chance if $P_{\mathrm{ok}} > 1/2$, which, by this
computation, can be guaranteed if $\beta > 2$, i.e. if the classical Bell inequality
(in Clauser-Horne-Shimony-Holt form [72]) is violated. But this is indeed the
case in experiments conducted to determine $\beta$ (e.g. [73]), which give roughly
$\beta \approx 2\sqrt{2} \approx 2.8$. If we believe these experiments, the only conclusion can be
that the joint measurability of the $B_1$ and $B_2$ used in the experiment would
be sufficient to make Bell’s Telephone work, which was our claim.
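The quantum value of the Bell correlation quoted above can be reproduced in a few lines. The sketch assumes the standard two-qubit singlet state and spin measurements in one plane, with measurement angles chosen by hand to maximize the combination (2.1); these choices are illustrative assumptions and are not specified in the text.

```python
import numpy as np

# Pauli matrices and the singlet state (|01> - |10>)/sqrt(2).
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def spin(theta):
    """Observable with outcomes +1 and -1: spin along an axis at angle theta in the x-z plane."""
    return np.cos(theta) * Z + np.sin(theta) * X

def correlation(A, B, psi):
    """Correlation coefficient C(A, B), computed as the expectation of A (x) B in psi."""
    return float(np.real(np.vdot(psi, np.kron(A, B) @ psi)))

# Assumed measurement settings for Alice (A1, A2) and Bob (B1, B2).
A1, A2 = spin(0.0), spin(np.pi / 2)
B1, B2 = spin(5 * np.pi / 4), spin(3 * np.pi / 4)

beta = (correlation(A1, B1, singlet) + correlation(A1, B2, singlet)
        + correlation(A2, B1, singlet) - correlation(A2, B2, singlet))
print(beta, beta / 4)
```

With these settings the Bell correlation comes out as 2√2, so the lower bound of (2.4) for Bob's success probability would be about 0.71, clearly better than chance, if a joint measuring device for Bob's two observables existed.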

2.3.4 Entanglement, Mixed-State Analyzers 
and Correlation Resolvers 

Violations of Bell’s inequalities can also be seen to prove the existence of a 
new class of correlations between quantum systems, known as entanglement. 
This concept is as fundamental to the field of quantum information theory 
as the idea of quantum information itself. So rather than organizing this 
introduction as an answer to the question “why quantum information is
different from classical information”, we could have followed the line “why 
entanglement is different from classical correlation”. There are impossible 
machines in this line of approach, too, and we shall now describe briefly how 
they fit in. 

Consider a correlation experiment of the kind used in the study of Bell’s 
inequalities (see Sect. 2.3.3). If Bob looks at his particles, and makes mea- 
surements on them without any communication from Alice, he will find that 
their statistics are described by a certain mixed state. The state must be 
mixed, because if he now listens to Alice and sorts his particles according 
to Alice’s measurement results, he will get two subensembles, which are in 
general different. In the usual ideal 2-qubit situation, in which one obtains 
the maximal violation of Bell’s inequalities, these subensembles are described 
by pure states. 
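For the ideal two-qubit case this can be verified directly. The sketch below assumes that the pair is in the singlet state and that Alice measures in the computational basis; the helper names are hypothetical.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
singlet = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)
rho = np.outer(singlet, singlet.conj())  # state of the pair (Alice first, Bob second)

def bob_reduced(rho_ab):
    """Partial trace over Alice's qubit, leaving Bob's 2x2 density matrix."""
    return np.trace(rho_ab.reshape(2, 2, 2, 2), axis1=0, axis2=2)

def purity(state):
    """Tr(state^2): equal to 1 for pure states, smaller for mixed states."""
    return float(np.real(np.trace(state @ state)))

def bob_conditional(rho_ab, alice_ket):
    """Bob's normalized state and its probability, given that Alice finds alice_ket."""
    proj = np.kron(np.outer(alice_ket, alice_ket.conj()), np.eye(2))
    unnormalized = bob_reduced(proj @ rho_ab @ proj)
    prob = float(np.real(np.trace(unnormalized)))
    return unnormalized / prob, prob

print(purity(bob_reduced(rho)))
for alice_ket in (ket0, ket1):
    state, prob = bob_conditional(rho, alice_ket)
    print(prob, purity(state))
```

Without Alice's results Bob's state has purity 1/2, the value of the maximally mixed qubit state, whereas each of the two subensembles sorted by Alice's outcome has purity 1.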

This is very satisfying for people who see the occurrence of mixed states 
in quantum mechanics merely as a result of ignorance, as opposed to the 
deeper kind of randomness encoded in pure states. This view usually comes 
with an individual-state interpretation of quantum mechanics, in which each
individual system can be assigned a pure state (a single vector in Hilbert 
space), and a general preparation procedure is given not just by its density 
matrix, but by a specific probability distribution of pure states. Let us use 
the term mixed-state analyzer for a hypothetical device which can see the 
difference, i.e. a measuring device whose output after many measurements 
on a given ensemble is not just a collection of expectations of quantum ob- 
servables, but the distribution of pure states in the ensemble. In the case of a 
correlation experiment, where Bob sees a mixed state only because he is igno- 
rant about Alice’s results, this machine would find for him the decomposition 
of his mixed state into two pure states. 

The problem is, of course, that Alice has several choices of measuring de- 
vices, and that the decomposition of Bob’s mixed state depends, accordingly, 
on Alice’s choice. Hence she could signal to Bob, and we would have another 
instance of Bell’s Telephone. There would be a way out if joint measurements
were available (to Alice in this case): then we could say that the two
decompositions were just the first step in an even finer decomposition, a further
reduction of ignorance, which would be brought to light if Alice were to ap- 
ply her joint measurement. Presumably the mixed-state analyzer would then 
yield this finer decomposition, because the operation of this device would not 
depend on how closely Alice cared to look at her particles. 

But just as two quantum observables are often not jointly measurable, two 
decompositions of mixed states often have no common refinement (actually, 
in the formalism of quantum theory, these are two variants of the same the- 
orem). In particular, the two decompositions belonging to Alice’s choices in 
an experiment demonstrating a violation of Bell’s inequalities have no com- 
mon refinement, and any mixed-state analyzer could be used for superluminal 
communication in this situation. 
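A small check makes the point about decompositions concrete. The sketch assumes the singlet situation, in which Alice's two choices (say, measuring along z or along x) leave Bob with two different ensembles of pure states; the ensembles are written out by hand and the names are hypothetical.

```python
import numpy as np

def projector(ket):
    """Density matrix |ket><ket| of a pure qubit state."""
    ket = np.asarray(ket, dtype=complex)
    return np.outer(ket, ket.conj())

s = 1 / np.sqrt(2)
# Each ensemble is a list of (probability, pure state) pairs on Bob's side.
z_ensemble = [(0.5, projector([1, 0])), (0.5, projector([0, 1]))]
x_ensemble = [(0.5, projector([s, s])), (0.5, projector([s, -s]))]

def average_state(ensemble):
    """Density matrix obtained by mixing the pure states with their probabilities."""
    return sum(p * state for p, state in ensemble)

rho_z = average_state(z_ensemble)
rho_x = average_state(x_ensemble)
print(np.allclose(rho_z, rho_x))
```

Both ensembles average to the same density matrix, so no measurement statistics on Bob's side can reveal which one Alice produced. A device that could tell them apart would therefore reveal Alice's choice, which is the signalling argument made above.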

Another device, which is suggested by the individual-state interpretation, 
arises from a naive extrapolation of this view to the parts of a composite 
system: if every single system could be assigned a pure state, a composite 
system could be assigned a pair of pure states, one for each subsystem. A 
correlated state should therefore be given by a probability distribution of 
such pairs. A device which represented an arbitrary state of a composite 
system as a mixture of uncorrelated pure product states might be called a 
correlation resolver. It could be built given a classical teleportation line: when 
one applies teleportation to one of the subsystems and applies conditions 
on the classical measurement results of the intermediate stage, one obtains 
precisely a representation of an arbitrary state in this form. But it is easy to 
see that any state which can be so analyzed automatically satisfies all Bell- 
type inequalities, and hence once again the experimental violations of Bell’s 
inequalities show that such a correlation resolver cannot exist. Hence we 
have here a second line of reasoning in favor of the no-teleportation theorem: 
a teleportation device would allow classical correlation resolution, which is 
shown to be impossible by the Bell experiments. 
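The claim that resolvable states satisfy Bell-type inequalities can be probed numerically for the combination (2.1). The sketch below builds random mixtures of pure product states and evaluates the Bell correlation for one fixed set of spin measurements; the settings and the random construction are assumptions made for illustration, and a finite sample is of course only a sanity check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(theta):
    """Observable with outcomes +1 and -1 along an axis in the x-z plane."""
    return np.cos(theta) * Z + np.sin(theta) * X

def random_pure_qubit():
    """A random normalized qubit state vector."""
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    return v / np.linalg.norm(v)

def random_separable_state(n_terms=5):
    """A random mixture of pure product states of two qubits."""
    weights = rng.random(n_terms)
    weights /= weights.sum()
    rho = np.zeros((4, 4), dtype=complex)
    for w in weights:
        psi = np.kron(random_pure_qubit(), random_pure_qubit())
        rho += w * np.outer(psi, psi.conj())
    return rho

def bell_correlation(rho):
    """Value of the combination (2.1) for fixed, hand-picked measurement settings."""
    A1, A2 = spin(0.0), spin(np.pi / 2)
    B1, B2 = spin(5 * np.pi / 4), spin(3 * np.pi / 4)
    chsh = np.kron(A1, B1) + np.kron(A1, B2) + np.kron(A2, B1) - np.kron(A2, B2)
    return float(np.real(np.trace(chsh @ rho)))

betas = [bell_correlation(random_separable_state()) for _ in range(200)]
print(max(betas) <= 2)
```

For a single product state the combination factorizes into Alice's expectation values times sums and differences of Bob's, each bounded by 1 in modulus, which caps the value at 2; mixing product states cannot raise it.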

The distinction between resolvable states and their complement is one 
of the starting points of entanglement theory, where the “resolvable” states 
are called “separable”, or “classically correlated”, and all others are called
“entangled”. For a more detailed treatment and an up-to-date overview, the
reader is referred to Chap. 5. 

Without going into philosophical discussions about the foundations of 
quantum mechanics, I should like to comment briefly on the individual-state 
interpretation, which has suggested the two impossible machines discussed in 
this subsection. First, this view is not at all uncommon, and it is quite possi- 
ble to read some passages from the masters of the Copenhagen interpretation 
as an endorsement of this view. Secondly, if we define a hidden-variable theory 
as a theory in which individual systems are described by classical parame- 
ters, whose distribution is responsible for the randomness seen in quantum 
experiments, we have no choice but to call the individual-state interpretation
a hidden-variable theory. The hidden variable in this theory is usually
denoted by ψ. And sure enough, as we have just pointed out, this theory
has all the difficulties with locality such a theory is known to have on gen- 
eral grounds. Thirdly, avoiding an individual-state interpretation, and with 
it some of its misleading intuitions, is easy enough. In practice this is done 
anyhow, by concentrating on those aspects of the theory which have some 
direct statistical meaning, and not on those involving hypothetical, and
usually impossible devices. This common ground is the statistical interpretation
of quantum mechanics, in which states (pure or mixed) are the analogues 
of classical probability distributions, and are not seen as a property of an 
individual system, but of a specific way of preparing the system. 



2.4 Possible Machines 

2.4.1 Operations on Multiple Inputs 

The no-teleportation theorem derived in the previous section says that there 
is no way to measure a quantum state in such a way that the measuring 
results suffice to reconstruct the state. At first sight this seems to deny that 
the notion of “quantum states” has an operational meaning at all. But there is 
no contradiction, and we shall resolve the apparent conflict in this subsection, 
if only to sharpen the statement of the no-teleportation theorem. 

Let us recall the operational definition of quantum states, according to 
the statistical interpretation of quantum mechanics. A state is a description 
of a way of preparing quantum systems, and in all its aspects it is related to 
computing expectation values. We might also say that it is the assignment of 
an expectation value to every observable of the system. So to the extent that 
expectation values can be measured, it is possible to determine the state by 
testing it on sufficiently many observables. What is crucial, however, is that 
even the determination of a single expectation value is a statistical measure- 
ment. Hence such a determination requires a repetition of the experiment 
many times, using many systems prepared according to the same procedure. 
In contrast, the above description of teleportation demands that it works 
with a single quantum system as input, and that the measuring device does 
not accumulate results from several input systems. Expressed in the current 
jargon, teleportation is required to be a one-shot operation. Note that this 
does not contradict our statistical criteria for the success of teleportation and 
of other devices, which involve the statistics of independent “single shots”.

If we have available many identically prepared systems, many operations 
which are otherwise impossible become easy. Let us begin with classical tele- 
portation. Its multiinput analogue is the state estimation problem: how can 
we design a measurement operating on samples of many (say, N) systems 
from the same preparing device, such that the measurement result in each 
case is a collection of classical parameters forming a Hermitian matrix which 
on average is close to the density matrix describing the initial preparation?
This is symbolized in Fig. 2.5 (with the box T omitted for the moment):
the box P at the end represents a repreparation of systems according to the
estimated density matrix. The overall output will then be a quantum system,
which can be directly compared with the inputs in statistical experiments. It
is clear that the state cannot be determined exactly from a sample with finite
$N$, but the determination becomes arbitrarily good in the limit $N \to \infty$.
Optimal estimation observables are known in the case when the inputs are 
guaranteed to be pure [74], but in the case of general mixed states there are 
no clear-cut theorems yet, partly owing to the fact that it is less clear what 
“figure of merit” best describes the quality of such an estimator. 
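
As a rough illustration of the estimation problem for a single qubit, the
following Python sketch simulates one simple (and certainly not optimal)
strategy: the N input systems are divided into three groups, one Pauli
observable is measured on each group, and the observed averages are assembled
into a Hermitian 2 x 2 matrix. The parametrization of the density matrix by
three Pauli expectations, the function names and the use of a pseudorandom
generator in place of a real preparing device are all assumptions made for
this example.

```python
import random

def estimate_bloch_vector(true_x, n_systems, rng):
    """Estimate the three Pauli expectations from n_systems identically prepared qubits.

    true_x stands in for the unknown preparation (a hypothetical input of the
    simulation). Measuring the k-th Pauli observable yields +1 with
    probability (1 + x_k) / 2 and -1 otherwise.
    """
    per_axis = n_systems // 3              # one third of the sample per observable
    estimate = []
    for x_k in true_x:
        p_plus = (1.0 + x_k) / 2.0
        outcomes = (1 if rng.random() < p_plus else -1 for _ in range(per_axis))
        estimate.append(sum(outcomes) / per_axis)
    return estimate

def density_matrix(x):
    """Hermitian 2x2 matrix (1 + x1*s1 + x2*s2 + x3*s3) / 2 built from the estimates."""
    x1, x2, x3 = x
    return [[(1 + x3) / 2, (x1 - 1j * x2) / 2],
            [(x1 + 1j * x2) / 2, (1 - x3) / 2]]

rng = random.Random(0)                     # fixed seed so that the run is reproducible
estimated_x = estimate_bloch_vector([0.6, 0.0, -0.8], 30000, rng)
estimated_rho = density_matrix(estimated_x)
```

For small N the estimated matrix may fail to be positive; a practical
estimator would have to correct for this, which the sketch does not attempt.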




Fig. 2.5. Classical teleportation with multiple inputs, or state estimation 



Given a good estimator we can, of course, proceed to good cloning by just 
repeating the repreparation P as often as desired. The surprise here [75] is 
that if only a fixed number M of outputs is required, it is possible to obtain 
better clones with devices that stay entirely in the quantum world than by 
going via classical estimation. Again, the problem of optimal cloning is fully 
understood for pure states [76], but work has only just begun to understand 
the mixed-state case. 
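
To put numbers to this comparison, the sketch below evaluates two closed-form
fidelities for pure qubit states. They are quoted here from the cloning
literature rather than derived in this chapter, and should be checked against
the cited references before being relied upon: (MN + M + N)/(M(N + 2)) for
the optimal device turning N inputs into M clones, and (N + 1)/(N + 2) for
the best strategy that goes via classical estimation. The function names are
hypothetical.

```python
from fractions import Fraction

def cloning_fidelity(n_inputs, m_outputs):
    """Single-clone fidelity of the optimal N -> M qubit cloner (quoted formula)."""
    return Fraction(m_outputs * n_inputs + m_outputs + n_inputs,
                    m_outputs * (n_inputs + 2))

def estimation_fidelity(n_inputs):
    """Fidelity reached by estimating from N inputs and repreparing (quoted formula)."""
    return Fraction(n_inputs + 1, n_inputs + 2)

# One input, increasing numbers of requested clones.
comparison = {m: (cloning_fidelity(1, m), estimation_fidelity(1))
              for m in (2, 3, 10, 1000)}
```

The difference between the two expressions is N/(M(N + 2)), which is positive
for every finite M and vanishes only as M grows without bound.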

Another operation which becomes accessible in this way is the universal 
not operation, assigning to each pure qubit state the unique pure state or- 
thogonal to it. Like time reversal, this is just a special case of an antiunitary 
symmetry operation. In this case, a strategy using a classical estimation as 
an intermediate step can be shown to be optimal [77]. In this sense “universal
not” is a harder task than “cloning”.
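
For a single known wave vector the orthogonal state is easy to write down,
and the following sketch does just that: for $\psi = (a, b)$ it returns
$(-\bar b, \bar a)$ and checks nothing more than orthogonality and the sign
flip of the three Pauli expectations. The complex conjugation in the formula
is what makes the map antiunitary. The function names and the choice of basis
are assumptions of this example, and the code is of course no substitute for
a device acting on unknown inputs.

```python
def orthogonal_state(psi):
    """Return the wave vector orthogonal to psi = (a, b), namely (-conj(b), conj(a))."""
    a, b = psi
    return (-b.conjugate(), a.conjugate())

def pauli_expectations(psi):
    """Expectations of the three Pauli matrices in the normalized wave vector psi."""
    a, b = psi
    c = a.conjugate() * b
    return (2 * c.real, 2 * c.imag, abs(a) ** 2 - abs(b) ** 2)

psi = (0.6 + 0j, 0.8j)                      # example input, chosen arbitrarily
psi_perp = orthogonal_state(psi)
overlap = psi[0].conjugate() * psi_perp[0] + psi[1].conjugate() * psi_perp[1]
```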

More generally, we can look at schemes such as those in Fig. 2.5, where T 
represents any transformation of the density matrix data, whether or not this 
transformation corresponds to a physically realizable transformation of quan- 
tum states. A further interesting application is to the purification of states. 
In this problem it is assumed that the input states were once pure, but were 
later corrupted in some noisy environment (the same for all inputs). The task
is to reconstruct the original pure states. Usually, the noise corresponds to an 
invertible linear transformation of the density matrices, but its inverse is not 
a possible operation, because it transforms some density matrices to operators
with negative eigenvalues. So the reversal of noise is not possible with a
one-shot device, but is easy to perform to high accuracy when many equally 
prepared inputs are available. In the simplest case of a so-called depolarizing 
channel, this problem is well understood [78]; it is also well understood in the 
version requiring many outputs, as in the optimal-cloning problem [79]. 
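
The obstruction can be made concrete for one qubit if the state is described,
as an assumption of this example, by the three Pauli expectations
$x = (x_1, x_2, x_3)$ with $|x| \le 1$, so that the density matrix has
eigenvalues $(1 \pm |x|)/2$. A depolarizing channel with parameter $\lambda$
(a hypothetical name) then shrinks $x$ to $\lambda x$, and its formal inverse
stretches $x$ by $1/\lambda$:

```python
import math

def depolarize(x, lam):
    """Depolarizing noise on the Pauli expectations: x -> lam * x, with 0 < lam <= 1."""
    return tuple(lam * c for c in x)

def formal_inverse(x, lam):
    """The linear inverse x -> x / lam, which is not a physical operation."""
    return tuple(c / lam for c in x)

def eigenvalues(x):
    """Eigenvalues (1 - |x|) / 2 and (1 + |x|) / 2 of the corresponding 2x2 matrix."""
    r = math.sqrt(sum(c * c for c in x))
    return ((1 - r) / 2, (1 + r) / 2)

lam = 0.5
pure_input = (0.0, 0.0, 1.0)
restored = formal_inverse(depolarize(pure_input, lam), lam)   # noisy state mapped back
unphysical = formal_inverse(pure_input, lam)                  # inverse applied to a pure state
```

The inverse restores states that really passed through the noise, but applied
to a state that is already pure it produces a matrix with a negative
eigenvalue, which is why no one-shot device can implement it.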

2.4.2 Quantum Cryptography 

It may seem impossible to find applications of impossible machines. But that 
is not quite true: sometimes the impossibility of a certain task is precisely 
what is called for in an application. A case in point is cryptography: here 
one tries to make the deciphering of a code impossible. So if we could design 
a code whose breaking would require one of the machines described in the 
previous section, we could guarantee its security with the certainty of natural 
law. This is precisely what quantum cryptography sets out to do. Because 
only small quantum systems are involved it is one of the “easiest” applica- 
tions of quantum information ideas, and was indeed the first to be realized 
experimentally. For a detailed description we refer to Chap. 3. Here we just 
describe in what sense it is the application of an impossible machine. 

As always in cryptography, the basic situation is that two parties, Alice 
and Bob, say, want to communicate without giving an “evil eavesdropper”, 
conventionally named Eve, a chance to listen in. What classical eavesdrop- 
pers do is to tap the transmission line, make a copy of what they hear for 
later analysis, and otherwise let the signal pass undisturbed to the legitimate 
receiver (Bob). But if the signal is quantum, the no-cloning theorem tells 
us that faithful copying is impossible. So either Eve’s copy or Bob’s copy is 
corrupted. In the first case Eve won’t learn anything, and hence there was 
no eavesdropping anyway. In the second case Bob will know that something 
may have gone wrong, and will tell Alice that they must discard that part 
of the secret key which they were exchanging. Of course, intermediate situa- 
tions are possible, and one has to show very carefully that there is an exact 
trade-off between the amount of information Eve can get and the amount of 
perturbation she must inflict on the channel. 
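
The trade-off can be seen in a toy simulation. The sketch assumes a
BB84-type protocol with two conjugate bases and an eavesdropper following the
simple intercept-and-resend strategy; neither choice is made in the text
above, and all names are hypothetical. A signal is modelled only by its basis
and bit value: measuring in the preparation basis returns the bit, measuring
in the other basis returns a fair coin toss.

```python
import random

def measure(signal, basis, rng):
    """Measure a (basis, bit) signal; return the outcome and the post-measurement signal."""
    prepared_basis, bit = signal
    outcome = bit if basis == prepared_basis else rng.randrange(2)
    return outcome, (basis, outcome)

def sifted_error_rate(n_signals, eavesdrop, rng):
    """Fraction of disagreements in the rounds where Alice and Bob used the same basis."""
    errors = kept = 0
    for _ in range(n_signals):
        alice_basis, alice_bit = rng.randrange(2), rng.randrange(2)
        signal = (alice_basis, alice_bit)
        if eavesdrop:
            # Eve measures in a random basis and resends what she found.
            _, signal = measure(signal, rng.randrange(2), rng)
        bob_basis = rng.randrange(2)
        bob_bit, _ = measure(signal, bob_basis, rng)
        if bob_basis == alice_basis:        # only these rounds enter the key
            kept += 1
            errors += bob_bit != alice_bit
    return errors / kept

rng = random.Random(1)                      # fixed seed for reproducibility
quiet_line = sifted_error_rate(20000, False, rng)
tapped_line = sifted_error_rate(20000, True, rng)
```

In this model Eve learns the bit with certainty in half of the rounds and
causes an error rate of about one quarter in the sifted key, which Alice and
Bob can detect by comparing a sample of their bits.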

2.4.3 Entanglement-Assisted Teleportation

This is arguably the first major discovery in the field of quantum informa- 
tion. The no-cloning and no-teleportation theorems, although they had not 
been formulated in such terms, would hardly have come as a surprise to peo- 
ple working on the foundations of quantum mechanics in the 1960s, say. But 
entanglement assistance was really an unexpected turn. It was first seen by 
Bennett et al. [52], who also coined the term “teleportation”. It is gratifying
to see, though it is hardly a surprise on the same scale, that this prediction of 
quantum mechanics has also been implemented experimentally. The experi- 
ments are another interesting story, which will no doubt be told much better 
in Chap. 3 by Weinfurter and Zeilinger, who represent one team in which a
major breakthrough in this regard was achieved.

Fig. 2.6. Entanglement-assisted teleportation

The teleportation scheme is shown in Fig. 2.6. What makes it so surpris- 
ing is that it combines two machines whose impossibility was discussed in the 
previous section: omitting the distribution of the entangled state (the lower 
half of Fig. 2.6), we get the impossible process of classical teleportation. On 
the other hand, if we omit the classical channel, we get an attempt to transmit 
information by means of correlations alone, i.e. a version of Bell’s Telephone. 
Since the time dimension is not represented in this diagram, let us consider 
the steps in the proper order. The first step is that Alice and Bob each receive 
one half of an entangled system. The source can be a third party or can be 
Bob’s lab. The last choice is maybe best for illustrative purposes, because it 
makes clear that no information is flowing from Alice to Bob at this stage. 
Alice is next given the quantum system whose state (which is unknown to 
her) she is to teleport. Alice then makes a measurement on a system made 
by combining the input and her half of the entangled system. She sends the 
results via a classical channel to Bob, who uses them to adjust the settings 
on his device, which then performs some unitary transformation on his half 
of the entangled system. The resulting system is the output, and if everything
is chosen in the right way, these output systems are indeed statistically
indistinguishable from the inputs. To see just how the entangled state S,
the measurement M and the repreparation P have to be chosen requires the 
mathematical framework of quantum theory. In the standard example one 
teleports a state of one qubit, using up one maximally entangled two-qubit 
system (in the current jargon, “1 ebit”) and sending two classical bits from 
Alice to Bob. A general characterization of the teleportation schemes for 
qubits and higher-dimensional systems is given below in Sect. 2.6.6. 
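
The standard example can be checked by direct calculation. The sketch below
assumes particular conventions that are not fixed in the text: the entangled
state is $(|00\rangle + |11\rangle)/\sqrt{2}$, Alice's measurement is in the
Bell basis labelled by two bits $(z, x)$, and Bob's correction is a Pauli X
if $x = 1$ followed by a Pauli Z if $z = 1$. All function and variable names
are hypothetical.

```python
import math

def teleport(psi):
    """Return (z, x, probability, bob_state) for each of Alice's four outcomes."""
    s = 1 / math.sqrt(2)
    # joint[i][j][k]: input qubit i, Alice's half j and Bob's half k of the pair.
    joint = [[[psi[i] * (s if j == k else 0.0) for k in range(2)]
              for j in range(2)] for i in range(2)]
    results = []
    for z in range(2):
        for x in range(2):
            # Project Alice's two qubits on the Bell vector
            # (|0,x> + (-1)^z |1,1-x>) / sqrt(2); what remains is Bob's state.
            bob = [s * (joint[0][x][k] + (-1) ** z * joint[1][1 - x][k])
                   for k in range(2)]
            prob = sum(abs(c) ** 2 for c in bob)
            bob = [c / math.sqrt(prob) for c in bob]
            if x:
                bob = [bob[1], bob[0]]      # Pauli X
            if z:
                bob = [bob[0], -bob[1]]     # Pauli Z
            results.append((z, x, prob, bob))
    return results

psi = (0.6, 0.8j)                           # example input, unknown to Alice in the protocol
outcomes = teleport(psi)
```

Each of the four outcomes occurs with probability 1/4 whatever the input, so
the two classical bits alone carry no information about it, and in each case
Bob's corrected system is in the input state.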

2.4.4 Superdense Coding

It is easy to see and in fact is a commonplace occurrence that classical in- 
formation can be transmitted on quantum channels. For example, one bit of 
classical information can be coded in any two-level system, such as the polar- 
ization degree of freedom of a photon. It is not entirely trivial to prove, but 
hardly surprising that one cannot do better than “one bit per qubit”. Can we
beat this bound using the idea of entanglement assistance? It turns out that
one can. In fact, one can double the amount of classical information carried
by a quantum channel (“two bits per qubit”). Remarkably, the setups for doing
this are closely related to teleportation schemes, and in the simplest cases
Alice and Bob just have to swap their equipment for entanglement-assisted
teleportation. This is explained in detail in Sect. 2.6.6. 
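
A minimal sketch of the simplest case, under the assumed conventions of a
shared state $(|00\rangle + |11\rangle)/\sqrt{2}$ and an encoding by Pauli
operations on Alice's qubit (all names hypothetical): the four encoded
two-qubit states turn out to be mutually orthogonal, so a single measurement
by Bob on both qubits recovers the two bits.

```python
import math

def encode(z, x):
    """Amplitudes state[a][b] after Alice applies X^x and then Z^z to her qubit a."""
    s = 1 / math.sqrt(2)
    state = [[s, 0.0], [0.0, s]]                      # (|00> + |11>) / sqrt(2)
    if x:
        state = [state[1], state[0]]                  # Pauli X swaps the rows a = 0, 1
    if z:
        state = [state[0], [-c for c in state[1]]]    # Pauli Z flips the sign of row a = 1
    return state

def overlap(u, v):
    """Inner product of two two-qubit states with real amplitudes."""
    return sum(u[a][b] * v[a][b] for a in range(2) for b in range(2))

def decode(state):
    """Bob's measurement in the basis of the four code states: find the one with overlap 1."""
    for z in range(2):
        for x in range(2):
            if abs(abs(overlap(encode(z, x), state)) - 1) < 1e-12:
                return z, x
    raise ValueError("state is not one of the four code states")

messages = [(z, x) for z in range(2) for x in range(2)]
received = [decode(encode(z, x)) for z, x in messages]
```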

2.4.5 Quantum Computation 

Again, we shall be very brief on this subject, although it is certainly central 
to the field. After all, it is partly the promise of a fantastic new class of 
computers which has boosted the interest in quantum information in recent 
years. But since in this book computation is covered in Chap. 4, we shall only 
make a few remarks connecting this subject to the theme of possible versus 
impossible machines. 

So can quantum computers perform otherwise impossible tasks? Not re- 
ally, because in principle we can solve the dynamical equations of quantum 
mechanics on a classical computer and simulate all the results. Hence classically
unsolvable problems such as the halting problem for Turing machines
and the word problem in group theory cannot be solved on quantum com- 
puters either. But this argument only shows the possibility of emulating all 
quantum computations on a classical computer, and omits the possibility 
that the efficiency of this procedure may be terrible. The great promise of 
quantum computation lies therefore in the reduction of running time, from 
exponential to polynomial time in the case of Shor’s factorization algorithm 
[80]. This reduction is comparable to replacing the task of counting all the 
way up to a 137-digit number by just having to write it down. No matter what the
constants are in the growth laws for the computing time (and they will prob- 
ably not be very favorable for the quantum contestant), the polynomial time 
is going to win if we are really interested in factoring very large numbers. 

A word of caution is necessary here concerning the impossible/possible 
distinction. While it is true that no polynomial-time classical factoring algo- 
rithm is known, and this is what counts from a practical point of view, there 
is no proof that no such algorithm exists. This is a typical state of affairs 
in complexity theory, because the nonexistence of an algorithm is a state- 
ment about the rather unwieldy set of all Turing machine programs. A proof 
by inspecting all of them is obviously out, so it would have to be based on 
some principle of “conservation of difficulties”, which rarely exists for real-life
problems. One problem in which this is possible is that of identifying
which (unique) element of a large list has a certain property (“needle in a 
haystack”). In this case the obvious strategy of inspecting every element in 
turn can be shown to be the optimal classical one, and has a running time 
proportional to the length N of the list. But Grover’s quantum algorithm 
[58] performs the task in the order of $\sqrt{N}$ steps, an amazing gain even if it
is not exponential. Hence there are problems for which quantum computers 
are provably faster than any classical computer. 
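
The quadratic gain can be observed in a small classical simulation of the
amplitudes, which of course takes time proportional to N per step and is
meant only as a check of the counting. The sketch assumes the usual
formulation of Grover's algorithm, with a sign flip on the marked item
followed by an inversion about the mean, and about $(\pi/4)\sqrt{N}$
repetitions; the function name and the example values are hypothetical.

```python
import math

def grover_success_probability(n_items, marked, n_steps):
    """Probability of finding the marked item after n_steps Grover iterations."""
    amp = [1 / math.sqrt(n_items)] * n_items      # uniform superposition of all items
    for _ in range(n_steps):
        amp[marked] = -amp[marked]                # oracle: flip the sign of the marked item
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]         # inversion about the mean
    return amp[marked] ** 2

n_items = 64
n_steps = int(math.pi / 4 * math.sqrt(n_items))  # number of iterations, about (pi/4) sqrt(N)
p_success = grover_success_probability(n_items, marked=17, n_steps=n_steps)
```

With no iterations at all the success probability is just 1/N, the chance of
a blind guess; after the prescribed number of steps it is close to one.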

So what makes the reduction of running time work? This is not so easy to 
answer, even after working through Shor’s algorithm and verifying the claim 
of exponential speedup. Massive entanglement is used in the algorithm, so 
this is certainly one important element. Then there is a technique known as 
quantum parallelism, in which a quantum computation is run on a coherent 
superposition of all possible classical inputs, and, in a sense, all values of a 
function are computed simultaneously. A catchy paraphrase, attributed to 
D. Deutsch, is to call this a computation in the parallel worlds of the many- 
worlds interpretation. 

But perhaps the best way to find out what powers quantum computation 
is to turn it around and to really try the classical emulation. The practical
difficulty which then becomes apparent immediately is that the Hilbert
space dimensions grow extremely fast. For $N$ qubits (two-level systems), one
has to operate in a Hilbert space of $2^N$ dimensions. The corresponding space
of density matrices has $2^{2N}$ dimensions. For classical bits one has instead a
configuration space of $2^N$ discrete points, and the analogue of the density
matrices, the probability densities, live in a merely $2^N$-dimensional space. Brute
force simulations of the whole system therefore tend to grind to a halt even on 
fairly small systems. Feynman was the first to turn this around: maybe only 
a quantum system can be used to simulate a quantum system, and maybe, 
while we are at it, we can go beyond simulation and do some interesting 
computations as well. So, putting it positively, in a quantum system we have 
exponentially more dimensions to work with: there is lots of room in Hilbert 
space. The added complexity of quantum versus classical correlations, i.e. the 
phenomenon of entanglement, is also a consequence of this. 
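
The dimension count is easily tabulated. In the sketch below the figure of
16 bytes per complex amplitude is an assumption about the classical machine
(two double-precision numbers), and the function names are hypothetical.

```python
def hilbert_space_dimension(n_qubits):
    """Dimension 2^N of the Hilbert space of N qubits."""
    return 2 ** n_qubits

def density_matrix_dimension(n_qubits):
    """Dimension 2^(2N) of the space of density matrices of N qubits."""
    return 2 ** (2 * n_qubits)

def state_vector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store one wave vector, under the assumed amplitude size."""
    return hilbert_space_dimension(n_qubits) * bytes_per_amplitude

table = [(n, hilbert_space_dimension(n), density_matrix_dimension(n), state_vector_bytes(n))
         for n in (1, 10, 20, 30, 40, 50)]
```

For 30 qubits a single wave vector already occupies $2^{34}$ bytes, i.e.
16 GiB under this assumption, and every further qubit doubles the figure.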

But it is not so easy to use those extra dimensions. For example, for 
transmission of classical information an $N$-qubit system is no better than
a classical $N$-bit system. Only the entanglement assistance of superdense
coding brings out the additional dimensions. Similarly, quantum computers 
do not speed up every computation, but are good only at specific tasks where 
the extra dimensions can be brought into play. 

2.4.6 Error Correction 

Again, we shall only make a few remarks related to the possible/impossible 
theme, and refer the reader to Chap. 4 for a deeper discussion. First of all,
error correction is absolutely crucial for the implementation of quantum computers.
Very early in the development of the subject the suspicion was raised
that exponential speedup was only possible if all component parts of the com- 
puter were realized with exponentially high (and hence practically unattain- 
able) precision. 

In a classical computer the solution to this problem is digitization: every 
bit is realized by a bistable circuit, and any deviation from the two wanted 
states is restored by the circuit at the expense of some energy and with some 
heat generation. This works separately for every bit, so in a sense every bit 
has its own heat bath. But this strategy will not work for quantum comput- 
ers: to begin with, there is now a continuum of pure states which would have 
to be stabilized for every qubit, and, secondly, one heat bath per qubit would 
quickly destroy entanglement and hence make the quantum computation im- 
possible. There are many indications that entanglement is indeed more easily 
destroyed by thermal noise and other sources of errors; this is summarily re- 
ferred to as decoherence. For example, a Gaussian channel (this is a special 
type of infinite-dimensional channel) has infinite capacity for classical infor- 
mation, no matter how much noise we add. But its quantum capacity drops 
to zero if we add more classical noise than that specified by the Heisenberg 
uncertainty relations [81]. 

A standard technique for stabilizing classical information is redundancy: 
just send a classical bit three times, and decide at the end by majority vote 
which bit to take. It is easy to see that this reduces the probability of error 
from order $\varepsilon$ to order $\varepsilon^2$. But quantum mechanically, this procedure is forbidden
by the no-cloning theorem: we simply cannot make three copies to
start the process. 
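
The claim about majority voting can be verified by enumerating the eight
possible error patterns, assuming (as the simplest model, not stated in the
text) that each of the three transmissions is flipped independently with
probability $\varepsilon$. The result is
$3\varepsilon^2(1-\varepsilon) + \varepsilon^3 = 3\varepsilon^2 - 2\varepsilon^3$.
The function name is hypothetical.

```python
from itertools import product

def majority_vote_error(eps):
    """Probability that at least two of three independent copies are flipped."""
    total = 0.0
    for flips in product((0, 1), repeat=3):       # all eight error patterns
        if sum(flips) >= 2:                       # the majority vote comes out wrong
            prob = 1.0
            for flipped in flips:
                prob *= eps if flipped else 1 - eps
            total += prob
    return total

single_copy_error = 0.01
encoded_error = majority_vote_error(single_copy_error)
```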

Fortunately, quantum error correction is possible in spite of all these 
doubts [82]. Like classical error correction, it also works by distributing the 
quantum information over several parallel channels, but it does this in a much 
more subtle way than copying. Using five parallel channels, one can obtain a 
similar reduction of errors from order $\varepsilon$ to order $\varepsilon^2$ [63]. Much more has been
done, but many open questions remain, for which I refer once again to Chap. 4.



2.5 A Preview of the Quantum Theory of Information 

Before we go on in the next section to turn some of the heuristic descriptions 
of the previous sections into rigorous mathematical statements, I shall try 
to give a flavor of the theory to be constructed, and of its motivations and 
current state of development. 

Theoretical physics contributes to the field of quantum information pro- 
cessing in two distinct, though interrelated ways. One of these ways is the 
construction of theoretical models of the systems which are being set up ex- 
perimentally as candidates for quantum devices. Of course, any such system 
will have very many degrees of freedom, of which only very few are singled
out as the “qubits” on which the quantum computation is performed. Hence 
it is necessary to analyze to what degree and on what timescales it is justi- 
fied to treat the qubit degrees of freedom separately, and with what errors 
the desired quantum operation can be realized in the given system. These 
questions are crucial for the realization of any quantum devices, and require 
specialized in-depth knowledge of the appropriate theory, e.g. quantum op- 
tics, solid-state theory or quantum chemistry (in the case of NMR quantum 
computing). However, these problems are not what we want to look at in this
chapter. The other way in which theoretical physics contributes to the field of
quantum information processing is in the form of another kind of theoretical
work, which could be called the “abstract quantum theory of information”.
Recall the arguments in Sect. 2.2, where the possibility of translating be- 
tween different carriers of (classical) information was taken as the justifica- 
tion for looking at an abstracted version, the classical theory of information, 
as founded by Shannon. While it is true that quantum information cannot 
be translated into this framework, and is hence a new kind of information, 
translation is often possible (at least in principle) between different carriers 
of quantum information. Therefore, we can make a similar abstraction in the 
quantum case. To this abstract theory all qubits are the same, whether they 
are realized as polarizations of photons, nuclear spins, excited states of ions in 
a trap, modes of a cavity electromagnetic field or whatever other realization 
may be feasible. A large amount of work is currently being devoted to this 
abstract branch of quantum information theory, so I shall list some of the 
reasons for this effort. 

• Abstract quantum theoretical reasoning is how it all started. In the early 
papers of Feynman and Deutsch, and in the papers by Bennett and co- 
workers, it is the structure of quantum theory itself which opens up all 
these new possibilities. No hint from experiment and no particular the- 
oretical difficulty in the description of concrete systems prompted this 
development. Since the technical realizations are lagging behind so much, 
the field will probably remain “theory driven” for some time to come. 

• If we want to transfer ideas from the classical theory of information to 
the quantum theory, we shall always get abstract statements. This works 
quite well for importing good questions. Unfortunately, however, the an- 
swers are most of the time not transferred so easily. 

• The reason for this difficulty with importing classical results is that some 
of the standard probabilistic techniques, such as conditioning, do not 
work in quantum theory, or work only sporadically. This is the same 
problem that the statistical mechanics of quantum many-particle sys- 
tems faces in comparison with its classical sister. The cure can only be 
the development of new, genuinely quantum techniques. Preferably these 
should work in the widest (and hence most abstract) possible setting. 

• One of the fascinating aspects of quantum information is that features
of quantum mechanics which were formerly seen only as paradoxical or 
counterintuitive are now turned into an asset: these are precisely the 
features one is trying to utilize now. But this means that naive intu- 
itive reasoning tends to lead to wrong results. Until we know much more 
about quantum information, we shall need rigorous guidance from a solid 
conceptual and mathematical foundation of the theory. 

• When we take as a selling point for, say, quantum cryptography that 
secrets are protected “with the security of natural law”, the argument is
only as convincing as the proof that reduces this claim to first principles. 
Clearly this requires abstract reasoning, because it must be independent 
of the physical implementation of the device the eavesdropper uses. The 
argument must also be completely rigorous in the mathematical sense. 

• Because it does not care about the physical realization of its “qubits”, 
the abstract quantum theory of information is applicable to a wide range 
of seemingly very different systems. Consider, for example, some abstract 
quantum gate like the “controlled not” (C-NOT). From the abstract the- 
ory, we can hope to obtain relevant quality criteria, such as the minimal 
fidelity with which this device has to be implemented for some algorithm 
to work. So systems of quite different types can be checked according to 
the same set of criteria, and a direct competition becomes possible (and 
interesting) between different branches of experimental physics. 

So what will be the basic concepts and features of the emerging quantum 
theory of information? The information-theoretical perspective typically gen- 
erates questions like 

How can a given task of quantum information processing be performed 
optimally with the given resources? 

We have already seen a few typical tasks of quantum information process- 
ing in the previous section and, of course, there are more. Typical resources 
required for cryptography, quantum teleportation and dense coding are en- 
tangled states, quantum channels and classical channels. In error correction 
and computing tasks, the resources are the size of the quantum memory 
and the number of quantum operations. Hence all these notions take on a 
quantitative meaning. 

For example, in entanglement-assisted teleportation the entangled pairs 
are used up (one maximally entangled qubit pair is needed for every qubit 
teleported). If we try to run this process with less than maximally entangled
states, we may still ask how many pairs from a given preparation device are
needed per qubit to teleport a message of many qubits, say, with an error less
than $\varepsilon$. This quantity is clearly a measure of entanglement. But other tasks
may lead to different quantitative measures of entanglement. Very often it is 
possible to find inequalities between different measures of entanglement, and 
establishing these inequalities is again a task of quantum information theory. 
The direct definition of an entanglement measure based on teleportation,
of the quantum information capacity of a channel and of many similar quan- 
tities requires an optimization with respect to all codings and decodings of 
asymptotically long quantum messages, which is extremely hard to evaluate. 
In the classical case, however, there is a simple formula for the capacity of a 
noisy channel, called Shannon’s coding theorem, which allows us to compute 
the capacity directly from the transition probabilities of a channel. Finding 
quantum analogues of the coding theorem (and similar formulas for entan- 
glement resources) is still one of the great challenges in quantum information 
theory. 
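
For a classical channel the computation is indeed elementary. The sketch
below takes the matrix of transition probabilities of a channel with two
input letters, evaluates the mutual information between input and output, and
maximizes it over the input distribution by a crude grid search. The binary
symmetric channel with flip probability 0.1 is an example chosen here, and
all names are hypothetical; a serious implementation would use a proper
optimization method instead of a grid.

```python
import math

def mutual_information(p_input, transition):
    """I(X;Y) in bits; transition[x][y] is the probability of output y given input x."""
    n_out = len(transition[0])
    p_output = [sum(p_input[x] * transition[x][y] for x in range(len(p_input)))
                for y in range(n_out)]
    info = 0.0
    for x, px in enumerate(p_input):
        for y in range(n_out):
            joint = px * transition[x][y]
            if joint > 0:                         # terms with zero weight contribute nothing
                info += joint * math.log2(transition[x][y] / p_output[y])
    return info

def capacity_two_inputs(transition, grid=1000):
    """Maximum of the mutual information over input distributions (k/grid, 1 - k/grid)."""
    return max(mutual_information((k / grid, 1 - k / grid), transition)
               for k in range(grid + 1))

def binary_entropy(p):
    """Entropy in bits of a coin with probabilities p and 1 - p."""
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

bsc = [[0.9, 0.1], [0.1, 0.9]]                    # binary symmetric channel, flip probability 0.1
capacity = capacity_two_inputs(bsc)
```

For the binary symmetric channel the maximum is attained at the uniform input
distribution and agrees with the closed form $1 - H(0.1)$, where $H$ is the
binary entropy.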



2.6 Elements of Quantum Information Theory 

It is probably too early to write a definitive account of quantum information 
theory -- there are simply too many open questions. But the basic concepts 
are clear enough, and it will be the task of the remainder of this chapter to 
explain them, and use the precise definitions to state some of the interesting 
open questions in the field. In the limited space available this cannot be done 
in textbook style, with many examples and full proofs of all the things used 
on the way (or even full references to them). So I shall try to emphasize the
main lines and to set up the basic definitions using as few primitive concepts 
as possible. For example, the capacities of a channel for either classical or 
quantum information will be defined on exactly the same pattern. This will 
make it easier to establish the relations between these concepts. 

The following sections begin with material which every physicist knows 
from quantum mechanics courses, although maybe not in this form. We need 
to go over this material, though, in order to establish the notation. 

2.6.1 Systems and States 

The systems occurring in the theory can be either quantum or classical, or 
can be hybrids composed of a classical and a quantum part. Therefore, we 
need a mathematical framework covering all these cases. A good choice is to 
characterize each type of system by its algebra of observables. In this chapter 
all algebras of observables will be taken to be finite-dimensional for simplicity.
Extensions to infinite dimensionality are mostly straightforward, though, and 
in fact a strength of the algebraic approach to quantum theory is that it deals 
not just with infinite-dimensional algebras, but also with systems of infinitely 
many degrees of freedom, as in quantum field theory [83, 84] and statistical 
mechanics [85]. 

The first main type of system consists of purely classical systems, whose 
algebra of observables is commutative, and can hence be considered as a space 
of complex-valued functions on a set $X$. Our assumption of finiteness requires
that $X$ is a finite set, and the algebra of observables $\mathcal{A}$ will be $\mathcal{C}(X)$, the space
of all functions $f : X \to \mathbb{C}$. A single classical bit corresponds to the choice
$X = \{0,1\}$. On the other hand, a purely quantum system is determined by the
choice $\mathcal{A} = \mathcal{B}(\mathcal{H})$, the algebra of all bounded linear operators on the Hilbert
space $\mathcal{H}$. The finiteness assumption requires that $\mathcal{H}$ has a finite dimension
$d$, so $\mathcal{A}$ is just the space $\mathcal{M}_d$ of complex $d \times d$ matrices. A qubit is given by
$\mathcal{A} = \mathcal{M}_2$.

The basic statistical interpretation of the algebra of observables is the 
same in the quantum and classical cases, and hinges on the cone of positive 
elements in the algebra. Here $Y$ is called positive (in symbols, $Y \ge 0$) if it
can be written in the form $Y = X^*X$. Then $Y \in \mathcal{M}_d$ is positive exactly
if it is given by a positive semidefinite matrix, and $f \in \mathcal{C}(X)$ is positive iff
$f(x) \ge 0$ for all $x$. In any algebra of observables $\mathcal{A}$, we shall denote by $\mathbb{1} \in \mathcal{A}$
the identity element.

A state $\rho$ on $\mathcal{A}$ is a positive normalized linear functional on $\mathcal{A}$. That is,
$\rho : \mathcal{A} \to \mathbb{C}$ is linear, with $\rho(X^*X) \ge 0$ and $\rho(\mathbb{1}) = 1$. Each state describes
a way of preparing systems, in all the details that are relevant to subsequent
statistical measurements on the systems. The measurements are described by
assigning to each outcome from a device an effect $F \in \mathcal{A}$, i.e. an element with
$0 \le F \le \mathbb{1}$. The prediction of the theory for the probability of that outcome,
measured on systems prepared according to the state $\rho$, is then $\rho(F)$.

For explicit computations we shall often need to expand states and elements
of $\mathcal{A}$ in a basis. The standard basis in $\mathcal{C}(X)$ consists of the functions
$e_x$, $x \in X$, such that $e_x(y) = 1$ for $x = y$ and zero otherwise. Similarly,
if $\phi_\mu \in \mathcal{H}$ is an orthonormal basis of the Hilbert space of a quantum system,
we denote by $e_{\mu\nu} = |\phi_\mu\rangle\langle\phi_\nu| \in \mathcal{B}(\mathcal{H})$ the corresponding “matrix units”.
Then a state $\rho$ on the classical algebra $\mathcal{C}(X)$ is characterized by the numbers
$\rho_x = \rho(e_x)$, which form a probability distribution on $X$, i.e. $\rho_x \ge 0$ and
$\sum_x \rho_x = 1$. Similarly, a quantum state $\rho$ on $\mathcal{B}(\mathcal{H})$ is given by the numbers
$\rho_{\mu\nu} = \rho(e_{\nu\mu})$, which form the so-called density matrix. If we interpret them
as the expansion coefficients of an operator $\hat\rho = \sum_{\mu\nu} \rho_{\mu\nu} e_{\mu\nu}$, the density
operator of $\rho$, we can also write $\rho(A) = \operatorname{tr}(\hat\rho A)$.
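
As a small worked example of these definitions for a qubit, the sketch below
stores a density operator as a 2 x 2 list of complex numbers and evaluates
$\rho(A) = \operatorname{tr}(\hat\rho A)$ on the identity and on an effect.
The particular matrix and the function names are assumptions made for
illustration.

```python
def trace_of_product(rho, a):
    """tr(rho a) for two square matrices given as nested lists."""
    n = len(rho)
    return sum(rho[i][j] * a[j][i] for i in range(n) for j in range(n))

# Example density operator: Hermitian, trace 1, determinant 0.125 > 0, hence positive.
rho_hat = [[0.75, 0.25j], [-0.25j, 0.25]]
identity = [[1, 0], [0, 1]]
effect = [[1, 0], [0, 0]]                   # a projection, hence an effect

normalization = trace_of_product(rho_hat, identity)   # value of the state on the identity
probability = trace_of_product(rho_hat, effect)       # probability assigned to the effect
```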

A state is called pure if it is extremal in the convex set of all states, i.e.
if it cannot be written as a convex combination $\lambda\rho' + (1-\lambda)\rho''$ of other
states. These are the states which contain as little randomness as possible.
In the classical case, the only pure states are those concentrated on a single
point $z \in X$, i.e. $\rho_z = 1$, or $\rho(f) = f(z)$. The pure states in the quantum
case are determined by “wave vectors” $\psi \in \mathcal{H}$ such that $\rho(A) = \langle\psi, A\psi\rangle$,
and $\hat\rho = |\psi\rangle\langle\psi|$. Thus, in the simplest case of a classical bit, there are just
two extreme points, whereas in the case of a qubit the extreme points form 
a sphere in three dimensions and are given by the expectations of the three 
Pauli matrices:

$$\hat\rho = \frac{1}{2}\Bigl(\mathbb{1} + \sum_{k=1}^{3} x_k\,\sigma_k\Bigr), \qquad x_k = \rho(\sigma_k). \tag{2.5}$$

Then positivity requires $|x|^2 \le 1$, with equality when $\rho$ is pure. This is shown
in Fig. 2.7.

Fig. 2.7. State spaces as convex sets: left, one classical bit; right, one quantum bit
(qubit)

Thus, in addition to the north pole $|1\rangle$ and the south pole $|0\rangle$, which
roughly correspond to the extremal states of the classical bit, we have their
coherent superpositions corresponding to the wave vectors $\alpha|1\rangle + \beta|0\rangle$, where
$\alpha, \beta \in \mathbb{C}$, and $|\alpha|^2 + |\beta|^2 = 1$. This additional freedom becomes even more
dramatic in higher-dimensional systems, and is crucial for the possibility of
entanglement.

Entanglement is a property of states of composite systems, so we must
introduce the notion of composition of systems. We shall define this in a
way which applies to classical and quantum systems alike. If $\mathcal{A}$ and $\mathcal{B}$ are
the algebras of observables of the subsystems, the algebra of observables of
the composition is defined to be the tensor product $\mathcal{A} \otimes \mathcal{B}$. In the
finite-dimensional case, which is our main concern, this is defined as the space of
linear combinations of elements that can be written as $A \otimes B$ with $A \in \mathcal{A}$
and $B \in \mathcal{B}$, such that $A \otimes B$ is linear in $A$ and linear in $B$. The algebraic
operations are defined by $(A \otimes B)^* = A^* \otimes B^*$, and $(A_1 \otimes B_1)(A_2 \otimes B_2) =
(A_1A_2) \otimes (B_1B_2)$. Thus $\mathbb{1} = \mathbb{1}_{\mathcal{A}} \otimes \mathbb{1}_{\mathcal{B}}$. Since positivity is defined in terms of a
star operation (adjoint) and a product, these definitions also determine the
states and effects of the composite system.

Let us explore how this unifies the more common definitions in the classical
and quantum cases. For two classical factors $\mathcal{C}(X) \otimes \mathcal{C}(Y)$, a basis is
formed by the elements $e_x \otimes e_y$, so the general element can be expanded as

$$f = \sum_{x,y} f(x,y)\, e_x \otimes e_y ,$$

and each element can be identified with a function on the Cartesian product
$X \times Y$. Hence $\mathcal{C}(X) \otimes \mathcal{C}(Y) = \mathcal{C}(X \times Y)$. Similarly, in the purely quantum
case, we can expand in matrix units and obtain quantities with four indices:
$(A \otimes B)_{\kappa\lambda,\mu\nu} = A_{\kappa\mu}B_{\lambda\nu}$. In a basis-free way, i.e. when $A$, $B$ are considered

as operators on Hilbert spaces Haj'Hb, this is defined by the equation 

{A 'Y> B){4>'Y> Ip) = (A(p) (g) (Bip) , 

where </> G Ha and ip GHb, and the tensor product of the Hilbert spaces is 
formed in the usual way. Hence B{Ha) B(Hb) = B[Ha ® Hb)- 
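
The defining equation of the operator tensor product can be verified on a small example. This is a sketch only: the helpers `kron`, `kron_vec` and `apply`, and the integer matrices chosen, are assumptions made for illustration.

```python
# Sketch: check (A (x) B)(phi (x) psi) = (A phi) (x) (B psi) in coordinates.

def kron(a, b):
    """Kronecker product; entry [(i,k)][(j,l)] equals a[i][j] * b[k][l]."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def kron_vec(u, v):
    """Coordinates of the product vector u (x) v."""
    return [x * y for x in u for y in v]

def apply(m, v):
    """Matrix acting on a coordinate vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
phi = [1, -1]
psi = [2, 5]

lhs = apply(kron(A, B), kron_vec(phi, psi))
rhs = kron_vec(apply(A, phi), apply(B, psi))
```

The pair index (i, k) is flattened as `i * dim_B + k`, which is the same bookkeeping as the four-index quantities obtained by expanding in matrix units.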

But the definition of a composition by a tensor product of algebras of
observables also determines how a quantum-classical hybrid must be described.
Such systems occur frequently in quantum information theory, whenever a
combination of classical and quantum information is given. We shall approach
hybrids in two equivalent ways, which are also useful more generally. Suppose
we know only that the first subsystem is classical and make no assumptions
about the nature of the second, i.e. we want to characterize tensor products
of the form $\mathcal{C}(X) \otimes \mathcal{B}$. Then every element can be expanded in the
form $B = \sum_x e_x \otimes B_x$, where now $B_x \in \mathcal{B}$. Clearly, the elements $B_x$
determine $B$, and hence we can identify the tensor product with the space
(sometimes denoted by $\mathcal{C}(X;\mathcal{B})$) of $\mathcal{B}$-valued functions on $X$ with pointwise
algebraic operations. Similarly, suppose that we know only that $\mathcal{B} = \mathcal{M}_d$ is
the algebra of $d \times d$ matrices. Then, expanding in matrix units, we find that
$A = \sum_{\mu\nu} A_{\mu\nu} \otimes e_{\mu\nu}$ with $A_{\mu\nu} \in \mathcal{A}$. That is, we can identify $\mathcal{A} \otimes \mathcal{M}_d$ with
the space (sometimes denoted by $\mathcal{M}_d(\mathcal{A})$) of $d \times d$ matrices with entries from
$\mathcal{A}$. By using the relation $e_{\mu\nu} e_{\alpha\beta} = \delta_{\nu\alpha} e_{\mu\beta}$, one can readily verify that the
product in $\mathcal{A} \otimes \mathcal{M}_d$ indeed corresponds to the usual matrix multiplication
in $\mathcal{M}_d(\mathcal{A})$, with due care given to the order of factors in products with
elements from $\mathcal{A}$, if $\mathcal{A}$ happens to be noncommutative. The adjoint is given by
$(A^*)_{\mu\nu} = (A_{\nu\mu})^*$. Hence a hybrid algebra $\mathcal{C}(X) \otimes \mathcal{M}_d$ can be viewed either
as the algebra of $\mathcal{C}(X)$-valued $d \times d$ matrices or as the space of $\mathcal{M}_d$-valued
functions on $X$.

The physical interpretation of a composite system $\mathcal{A} \otimes \mathcal{B}$ in terms of states
and effects is straightforward. When $F \in \mathcal{A}$ and $G \in \mathcal{B}$ are effects, so is $F \otimes G$,
and this is interpreted as the joint measurement of $F$ on the first subsystem
and of $G$ on the second subsystem, where the "yes" outcome is taken as "both
effects give yes". In particular, $F \otimes \mathbb{1}_{\mathcal{B}}$ corresponds to measuring $F$ on the first
system, completely ignoring the second. Thus, for any state $\rho$ on $\mathcal{A} \otimes \mathcal{B}$ we
define the restriction $\rho^A$ of $\rho$ to $\mathcal{A}$ by $\rho^A(A) = \rho(A \otimes \mathbb{1}_{\mathcal{B}})$. In the classical case,
the probability density for $\rho^A$ is obtained by integrating out the $\mathcal{B}$ variables.
In the quantum case, it corresponds to the partial trace of density matrices
with respect to $\mathcal{H}_B$. In general, it is not possible to reconstruct the state $\rho$
from the restrictions $\rho^A$ and $\rho^B$, which is another way of saying that $\rho$ also
describes correlations between the systems. However, given $\rho^A$ and $\rho^B$, there
is always a state with these restrictions, namely the tensor product $\rho^A \otimes \rho^B$,
which corresponds to an independent preparation of the subsystems.

A fundamental difference between quantum and classical correlations lies
in the nature of pure states of composite systems. Classically the situation is
easy: a pure state of the composite system $\mathcal{C}(X) \otimes \mathcal{C}(Y) = \mathcal{C}(X \times Y)$ is just
a point $(x, y) \in X \times Y$. Obviously, the restrictions of this state are the pure
states concentrated on $x$ and $y$, respectively. More generally, whenever one of
the algebras in $\mathcal{A} \otimes \mathcal{B}$ is commutative, every pure state will restrict to pure
states on the subsystems. Not so in the purely quantum case. Here the pure
states are given by unit vectors $\Phi$ in the tensor product $\mathcal{H}_A \otimes \mathcal{H}_B$, and unless
$\Phi$ happens to be of the special form $\phi_A \otimes \phi_B$ (and not a linear combination
of such vectors), the state will not be a product, and the restrictions will not
be pure. The following standard form of vectors in a tensor product, known
as the Schmidt decomposition, is used in entanglement theory every day, and
twice on Sundays.

Lemma 2.1. (1) ("Schmidt decomposition") Let $\Phi \in \mathcal{H}_A \otimes \mathcal{H}_B$ be a unit
vector, and let $\rho^A$ denote the density operator of its restriction to the first
factor. Then if $\rho^A = \sum_\mu \lambda_\mu |e_\mu\rangle\langle e_\mu|$ (with $\lambda_\mu > 0$) is the spectral resolution,
we can find an orthonormal system $e'_\mu \in \mathcal{H}_B$ such that

$$\Phi = \sum_\mu \sqrt{\lambda_\mu}\; e_\mu \otimes e'_\mu .$$

(2) ("Purification") An arbitrary quantum state $\rho$ on $\mathcal{H}$ can be extended to
a pure state on a larger system with a Hilbert space $\mathcal{H} \otimes \mathcal{H}_B$. Moreover, the
restricted density matrix $\rho^B$ can be chosen to have no zero eigenvalues, and
with this additional condition the space $\mathcal{H}_B$ and the extended pure state are
unique up to a unitary transformation.

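Proof. (1) We may expand $\Phi$ as $\Phi = \sum_\mu e_\mu \otimes \psi_\mu$ with suitable vectors
$\psi_\mu \in \mathcal{H}_B$. The reduced density matrix is determined by

$$\mathrm{tr}(\rho^A A) = \langle\Phi, (A \otimes \mathbb{1})\Phi\rangle = \sum_{\mu\nu} \langle e_\mu, A e_\nu\rangle \langle\psi_\mu, \psi_\nu\rangle = \sum_\mu \lambda_\mu \langle e_\mu, A e_\mu\rangle .$$

Since $A$ is arbitrary (e.g. $A = |e_\alpha\rangle\langle e_\beta|$), we may compare coefficients, and
obtain $\langle\psi_\mu, \psi_\nu\rangle = \lambda_\mu \delta_{\mu\nu}$. Hence $e'_\mu = \lambda_\mu^{-1/2} \psi_\mu$ is the desired orthonormal
system.

(2) The existence of the purification is evident if one defines $\Phi$ as above,
with the orthonormal system $e'_\mu$ chosen in an arbitrary way. Then
$\rho^B = \sum_\mu \lambda_\mu |e'_\mu\rangle\langle e'_\mu|$, and the above computation shows that choosing the basis $e'_\mu$
is the only freedom in this construction. But any two bases are linked by a
unitary transformation. □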
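
The purification construction of the lemma can be sketched in coordinates. The code below assumes, for illustration only, that the state is given in its eigenbasis by a list of strictly positive eigenvalues; the function names `schmidt_vector`, `restrict_to_A` and `purity` are hypothetical.

```python
import math

def schmidt_vector(eigenvalues):
    """Phi = sum_mu sqrt(lambda_mu) e_mu (x) e'_mu in product-basis coordinates."""
    d = len(eigenvalues)
    phi = [0.0] * (d * d)
    for mu, lam in enumerate(eigenvalues):
        phi[mu * d + mu] = math.sqrt(lam)
    return phi

def restrict_to_A(phi, d_a, d_b):
    """Density matrix of the restriction to the first factor (partial trace)."""
    return [[sum(phi[i * d_b + k] * phi[j * d_b + k].conjugate()
                 for k in range(d_b))
             for j in range(d_a)] for i in range(d_a)]

def purity(rho):
    """tr(rho^2), which equals 1 exactly for pure states."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real

eigenvalues = [0.7, 0.3]           # assumed spectrum of the state to purify
phi = schmidt_vector(eigenvalues)  # unit vector on the doubled system
rho_a = restrict_to_A(phi, 2, 2)   # recovers diag(0.7, 0.3) up to rounding
```

The vector `phi` is a pure state of the composite system, yet `purity(rho_a)` is below one: the restriction of a nonproduct pure state is mixed.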

A nonproduct pure state is a basic example of an entangled state in the
sense of the following definition:

Definition 2.1. A state $\rho$ on $\mathcal{A} \otimes \mathcal{B}$ is called separable (or "classically
correlated") if it can be written as

$$\rho = \sum_\mu \lambda_\mu\, \rho^A_\mu \otimes \rho^B_\mu ,$$

with states $\rho^A_\mu$ and $\rho^B_\mu$ on $\mathcal{A}$ and $\mathcal{B}$, respectively, and weights $\lambda_\mu > 0$.
Otherwise, $\rho$ is called entangled.

Thus a classically correlated state may well contain nontrivial correlations. 
In fact, if either A or B is classical, every state is classically correlated. What 
the definition expresses is only that we may generate these correlations by a 
purely classical mechanism. We can use a classical random generator, which 
produces the result "$\mu$" with probability $\lambda_\mu$, together with two preparing
devices operating independently but receiving instructions from the random
generator: $\rho^A_\mu$ is the state produced by the A device if it receives the input
"$\mu$" from the random generator, and similarly for B. Then the overall state
prepared by this setup is p, and clearly the source of all correlations in this 
state lies in the classical random generator. 
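
The preparation procedure just described can be written out for two qubits. This sketch assumes a fair classical coin as the random generator and the basis projections as the conditional states; all names are illustrative.

```python
# Sketch: a separable state rho = sum_mu lambda_mu rho^A_mu (x) rho^B_mu.

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def expectation(rho, a):
    """tr(rho a), computed without forming the full product."""
    n = len(rho)
    return sum(rho[i][k] * a[k][i] for i in range(n) for k in range(n))

P0 = [[1, 0], [0, 0]]
P1 = [[0, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

weights = [0.5, 0.5]   # classical random generator (assumed fair coin)
states_a = [P0, P1]    # state prepared by the A device on input mu
states_b = [P0, P1]    # state prepared by the B device on input mu

rho = [[0.0] * 4 for _ in range(4)]
for lam, ra, rb in zip(weights, states_a, states_b):
    term = kron(ra, rb)
    rho = [[rho[i][j] + lam * term[i][j] for j in range(4)] for i in range(4)]

correlation = expectation(rho, kron(Z, Z))  # perfectly correlated outcomes
mean_a = expectation(rho, kron(Z, I2))      # yet each side alone is unbiased
```

The state is strongly correlated, but by construction every correlation originates in the shared classical input, so it is separable in the sense of Definition 2.1.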

For an extensive treatment of these concepts the reader is now referred 
to Chap. 5. We shall turn instead to the second fundamental type of objects 
in quantum information theory, the channels. 

2.6.2 Channels 

Any processing step of quantum information is represented by a "channel".
This covers a great variety of operations, from preparation to time evolution, 
measurement, and measurement with general state changes. Both the input 
and the output of a channel may be an arbitrary combination of classical 
and quantum information. The combination of different kinds of inputs or 
outputs causes no special problems of formulation: it simply means that the 
algebras of observables of the input and output systems of a channel must 
be chosen as suitable tensor products. 

The basic idea of the mathematical description of a channel is to characterize
the transformation $T$ in terms of the way it modifies subsequent
measurements. Suppose the channel converts systems with algebra $\mathcal{A}$ into systems
with algebra $\mathcal{B}$. Then, by applying first the channel, and then a yes/no
measurement $F$ on the $\mathcal{B}$-type output system, we have effectively measured an
effect on the $\mathcal{A}$-type system, which will be denoted by $T(F)$. Hence a channel
is completely specified by a map $T : \mathcal{B} \to \mathcal{A}$, and we shall say, for simplicity,
that this map is the channel. There is, of course, an alternative way of viewing
a channel, namely as a map taking input states to output states, i.e. states
on $\mathcal{A}$ into states on $\mathcal{B}$, which we shall denote by $T_*$. We shall say that
$T$ describes the channel in the Heisenberg picture, whereas $T_*$ describes the
same channel in the Schrödinger picture. The descriptions are connected by
the equation

$$[T_*(\rho)](F) = \rho[T(F)] ,$$    (2.6)

where $\rho$ is an arbitrary state on $\mathcal{A}$, and $F \in \mathcal{B}$ is also arbitrary. The notation
on the left-hand side is sometimes a little clumsy; therefore we shall often
write $T_*(\rho) = \rho \circ T$, where "$\circ$" denotes a composition of maps, in this case
from $\mathcal{B}$ to $\mathcal{A}$ to $\mathbb{C}$. A composition of channels will then also be written in the
form $S \circ T$. This has the advantage that things are written from left to right in
the order in which they happen: first the preparation, then some channels, and
finally the yes/no measurement $F$. As a further simplification, we shall often
follow the convention of dropping the parentheses of the arguments of linear
operators (e.g. $T(A) = TA$) and dropping the $\circ$ symbols, but reintroducing
any of these elements for punctuation whenever they help to make expressions
unambiguous or just more readable.

For many questions in quantum information theory it is crucial to have a 
precise notion of the set of possible channels between two types of systems: 
clearly, the distinction between “possible” and “impossible” in Sect. 2.3 is of 
this kind, but the search for the “optimal device” performing a certain task 
is also of this kind. There are two different approaches for defining the set of 
maps $T : \mathcal{B} \to \mathcal{A}$ which should qualify as channels, and luckily they agree. The
first approach is axiomatic: one just lists the properties of $T$ which are forced
on us by the statistical interpretation of the theory. The second approach is 
constructive: one lists operations which can actually be performed according 
to the conventional wisdom of quantum mechanics and defines the admissible 
channels as those which can be assembled from these building blocks. The 
equivalence between these approaches is one of the fundamental theorems in 
this field, and is known as the Stinespring dilation theorem. We shall state 
this theorem after describing both approaches and giving a formal definition 
of “channels”. 

Note that the left-hand side of (2.6) is linear in $F$, which reflects the
fact that a mixture of effects ("use effect $F_1$ in 42% of the cases and $F_2$
in the remaining cases") directly becomes a mixture of the corresponding
probabilities. Therefore, the right-hand side also has to be linear in $F$, i.e.
$T : \mathcal{B} \to \mathcal{A}$ must be a linear operator, from the statistical interpretation of the
theory. Obviously, $T$ also has to take positive operators $F$ into a positive $T(F)$
("$T$ is positive"), and the trivial measurement has to remain trivial:
$T\mathbb{1}_{\mathcal{B}} = \mathbb{1}_{\mathcal{A}}$ ("$T$ is unit preserving, or unital"). This is equivalent to $T_*$ being likewise
a positive linear operator, with the normalization condition $\mathrm{tr}\, T_*(\rho) = \mathrm{tr}\, \rho$.
Finally, we would like to have an operation of "running two channels in
parallel", i.e. we would like to define $T \otimes S : \mathcal{A}_1 \otimes \mathcal{B}_1 \to \mathcal{A}_2 \otimes \mathcal{B}_2$ for arbitrary
channels $T : \mathcal{A}_1 \to \mathcal{A}_2$ and $S : \mathcal{B}_1 \to \mathcal{B}_2$. Since the identity $I_n$ on an
$n$-level quantum system $\mathcal{M}_n$ is one of the channels we want to consider, we
must demand that $T \otimes I_n$ also takes positive elements to positive elements.
This "complete positivity" of $T$ is a nontrivial requirement for maps between
quantum systems. If $\mathcal{A}$ or $\mathcal{B}$ is classical, any positive linear map from $\mathcal{A}$ to $\mathcal{B}$
is automatically completely positive. For arbitrary completely positive maps,
the product $T \otimes S$ is defined and is again completely positive, so just requiring
the ability to form a tensor product with the "innocent bystander" suffices
to make all parallel channels well defined.

Definition 2.2. A channel converting systems with an algebra of observables
$\mathcal{A}$ to systems with an algebra of observables $\mathcal{B}$ is a completely positive,
unit-preserving linear operator $T : \mathcal{B} \to \mathcal{A}$.

In the “constructive” approach one allows only maps which can be built 
from the basic operations of (1) tensoring with a second system in a speci- 
fied state, (2) unitary transformation and (3) reduction to a subsystem. Let 
us describe these and some other basic channels more formally, if only to 
show the richness of this concept. We leave the verification of the channel 
properties, including complete positivity, to the reader. 

• Expansion. This expands system A by system B in the state $\rho'$, say. Thus
$T_*(\rho) = \rho \otimes \rho'$, or, from (2.6), $T : \mathcal{A} \otimes \mathcal{B} \to \mathcal{A}$ with $T(A \otimes B) = \rho'(B) A$.

• Restriction. In the Heisenberg picture, the operation of discarding system
B from the composite system $\mathcal{A} \otimes \mathcal{B}$ is $T : \mathcal{A} \to \mathcal{A} \otimes \mathcal{B}$, with $T(A) = A \otimes \mathbb{1}_{\mathcal{B}}$.
As noted before, this corresponds to taking partial traces if B is quantum,
and to an integration over $Y$ if $\mathcal{B} = \mathcal{C}(Y)$ is classical.

• Symmetry. By definition, the symmetries of a quantum system with
an algebra of observables $\mathcal{A}$ are the invertible channels, i.e. channels
$T : \mathcal{A} \to \mathcal{A}$ such that there is a channel $S$ with $ST = TS = I_{\mathcal{A}}$. It turns
out that these are precisely the automorphisms of $\mathcal{A}$, i.e. invertible linear
maps $T : \mathcal{A} \to \mathcal{A}$ such that $T(AB) = T(A)T(B)$, and $T(A^*) = T(A)^*$.
For a pure quantum system the symmetries are precisely the unitarily
implemented maps, i.e. the maps of the form $T(A) = UAU^*$, where $U$ is
a unitary element of $\mathcal{A}$. To readers familiar with Wigner's theorem (e.g.
corollary 3.3 in [86]), another class of maps is conspicuously absent here,
namely positive maps of the form $T(A) = \Theta A^* \Theta^*$, where $\Theta$ is antiunitary.
It is well known that owing to the positivity of energy, a time-reversal
symmetry can be implemented only by such an antiunitary transformation.
But since such symmetries are not completely positive, they can only
be global symmetries, and can never occur as symmetries affecting only
a subsystem of the world.

• Observable. A measurement is simply a channel with a classical output
algebra, say $\mathcal{B} = \mathcal{C}(X)$. Obviously, $T : \mathcal{B} \to \mathcal{A}$ is uniquely determined by
the collection of operators $F_x = T(e_x)$ via $Tf = \sum_x f(x) F_x$. The channel
property of $T$ is equivalent to

$$F_x \in \mathcal{A} , \qquad F_x \ge 0 , \qquad \sum_x F_x = \mathbb{1}_{\mathcal{A}} .$$

Either the "resolution of the identity" $\{F_x\}$ or the channel $T$ will be
called an observable here. This differs in two ways from the usual textbook
definitions of this term: firstly, the outputs $x \in X$ need not be
real numbers, and secondly, the operators $F_x$, whose expectations are the
probabilities for obtaining the output $x$, need not be projection operators.
This is sometimes expressed by calling $T$ a generalized observable or
a POVM (positive operator-valued measure). This is to distinguish them
from the old-style "nongeneralized" observables, which are called PVMs
(projection-valued measures) because $F_x^2 = F_x$.

• Separable Channel. A classical teleportation scheme is a composition of
an observable and a preparation depending on a classical input, i.e. it is
of the form

$$T(A) = \sum_x F_x\, \rho_x(A) ,$$    (2.7)

where the $F_x$ form an observable, and $\rho_x$ is the reconstructed state when
the measurement result is $x$. Equivalently, we can say that $T = RS$,
where 'input of S' = 'output of R' is a classical system with an algebra
of observables $\mathcal{C}(X)$. The impossibility of classical teleportation, in this
language, is the statement that no separable channel can be equal to the
identity.

• Instrument. An observable describes only the statistics of the measuring
results, and contains no information about the state of the system after
the measurement. If we want such a more detailed description, we have to
count the quantum system after the measurement as one of the outputs.
Thus we obtain a composite output algebra $\mathcal{C}(X) \otimes \mathcal{B}$, where $X$ is the
set of classical outcomes of the measurement and $\mathcal{B}$ describes the output
systems, which can be of a different type in general from the input systems
with an algebra of observables $\mathcal{A}$. The term "instrument" for such devices
was coined by Davies [86]. As in the case of observables, it is convenient
to expand in the basis $\{e_x\}$ of the classical algebra. Thus
$T : \mathcal{C}(X) \otimes \mathcal{B} \to \mathcal{A}$ can be considered as a collection of maps $T_x : \mathcal{B} \to \mathcal{A}$, such that
$T(f \otimes B) = \sum_x f(x) T_x(B)$. The conditions on $T_x$ are

$$T_x : \mathcal{B} \to \mathcal{A} \ \text{completely positive, and} \ \sum_x T_x(\mathbb{1}) = \mathbb{1} .$$

Note that an instrument has two kinds of "marginals": we can ignore
the $\mathcal{B}$ output, which leads to the observable $F_x = T_x(\mathbb{1}_{\mathcal{B}})$, or we can
ignore the measurement results, which gives the overall state change
$\bar{T} = \sum_x T_x$.

• Von Neumann Measurement. A von Neumann measurement is a special
case of an instrument, associated with a family of orthogonal projections,
i.e. $p_x \in \mathcal{A}$ with $p_x^* p_y = \delta_{xy} p_x$ and $\sum_x p_x = \mathbb{1}$. These define an instrument
$T : \mathcal{C}(X) \otimes \mathcal{A} \to \mathcal{A}$ via $T_x(A) = p_x A p_x$. What von Neumann actually
proposed [87] was the version of this with one-dimensional projections
$p_x$, so the general case is sometimes called an incomplete von Neumann
measurement or a Lüders measurement. The characteristic property of
such measurements is their repeatability: since $T_x T_y = 0$ for $x \ne y$,
repeating the measurement a second time (or any number of times) will
always give the same output. For this reason the "projection postulate",
which demanded that any decent measurement should be of this form,
dominated the theory of quantum measurement processes for a long time.

• Classical Input. Classical information may occur in the input of a device
just as well as in the output. Again this leads to a family of maps
$T_x : \mathcal{B} \to \mathcal{A}$ such that $T : \mathcal{B} \to \mathcal{C}(X) \otimes \mathcal{A}$, with $T(B) = \sum_x e_x \otimes T_x(B)$. The
conditions on $\{T_x\}$ are

$$T_x : \mathcal{B} \to \mathcal{A} \ \text{completely positive, and} \ T_x(\mathbb{1}) = \mathbb{1} .$$

Note that this looks very similar to the conditions for an instrument, but
the normalization is different. An interesting special case is a "preparator",
for which $\mathcal{A} = \mathbb{C}$ is trivial. This prepares B states that depend in
an arbitrary way on the classical input $x$.

• Kraus Form. Consider quantum systems with Hilbert spaces $\mathcal{H}_A$ and
$\mathcal{H}_B$, and let $K : \mathcal{H}_A \to \mathcal{H}_B$ be a bounded operator. Then the map
$T_K(B) = K^* B K$ from $\mathcal{B}(\mathcal{H}_B)$ to $\mathcal{B}(\mathcal{H}_A)$ is positive. Moreover, $T_K \otimes I_n$
can be written in the same form, with $K$ replaced by $K \otimes \mathbb{1}_n$. Hence $T_K$
is completely positive. It follows that maps of the form

$$T(B) = \sum_x K_x^* B K_x , \quad \text{where} \quad \sum_x K_x^* K_x = \mathbb{1} ,$$    (2.8)

are channels. It will be a consequence of the Stinespring theorem that any
channel from $\mathcal{B}(\mathcal{H}_B)$ to $\mathcal{B}(\mathcal{H}_A)$ can be written in this form, which we call the
Kraus form, following current usage. This refers to the book [88], which
is a still recommended early account of the notion of complete positivity
in physics.

• Ancilla Form. As stated above, every channel, defined abstractly as a
completely positive normalized map, can be constructed in terms of simpler
ones. A frequently used decomposition is shown in Fig. 2.8. The
input system is coupled to an auxiliary system A, conventionally called
the "ancilla" ("maidservant"). Then a unitary transformation is carried
out, e.g. by letting the system evolve according to a tailor-made interaction
Hamiltonian, and finally the ancilla (or, more generally, a suitable
subsystem) is discarded.

Fig. 2.8. Representation of an arbitrary channel as a unitary transformation on a
system extended by an ancilla
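
Several of the channel types listed above can be illustrated with one small example. The sketch below assumes an amplitude-damping pair of Kraus operators with an arbitrary parameter `gamma`; it builds the channel of (2.8) in the Heisenberg picture, its Schrödinger-picture counterpart, and the observable marginal of the associated instrument, and checks the duality relation (2.6). All names are illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Adjoint (conjugate transpose) of a matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

gamma = 0.25  # assumed damping parameter, 0 <= gamma <= 1
kraus = [
    [[1.0, 0.0], [0.0, math.sqrt(1 - gamma)]],
    [[0.0, math.sqrt(gamma)], [0.0, 0.0]],
]

def heisenberg(b):
    """T(B) = sum_x K_x^* B K_x, acting on observables."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        out = add(out, matmul(dagger(k), matmul(b, k)))
    return out

def schroedinger(rho):
    """T_*(rho) = sum_x K_x rho K_x^*, acting on density matrices."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        out = add(out, matmul(k, matmul(rho, dagger(k))))
    return out

identity = [[1.0, 0.0], [0.0, 1.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]
rho = [[0.5, 0.5], [0.5, 0.5]]

effects = [matmul(dagger(k), k) for k in kraus]  # F_x = T_x(1)
unit_image = heisenberg(identity)                # should equal the identity
lhs = trace(matmul(schroedinger(rho), Z))        # [T_*(rho)](Z)
rhs = trace(matmul(rho, heisenberg(Z)))          # rho[T(Z)]
```

Reading each summand $B \mapsto K_x^* B K_x$ as one branch $T_x$ of an instrument, `effects` is its observable marginal and `heisenberg` is the overall state change obtained by ignoring the measurement result.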

The claim that every channel can be represented in the last two forms is 
a direct consequence of the fundamental structural theorem for completely 
positive maps, due to Stinespring [89]. We state it here in a version adapted 
to pure quantum systems, containing no classical components. 

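Theorem 2.1. (Stinespring Theorem). Let $T : \mathcal{M}_n \to \mathcal{M}_m$ be a completely
positive linear map. Then there is a number $\ell$ and an operator
$V : \mathbb{C}^m \to \mathbb{C}^n \otimes \mathbb{C}^\ell$ such that

$$T(X) = V^* (X \otimes \mathbb{1}_\ell) V ,$$    (2.9)

and the vectors of the form $(X \otimes \mathbb{1}_\ell) V \phi$, where $X \in \mathcal{M}_n$ and $\phi \in \mathbb{C}^m$, span
$\mathbb{C}^n \otimes \mathbb{C}^\ell$. This decomposition is unique up to a unitary transformation of $\mathbb{C}^\ell$.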

The ancilla form of a channel $T$ is obtained by tensoring the Hilbert spaces
$\mathbb{C}^m$ and $\mathbb{C}^n \otimes \mathbb{C}^\ell$ with suitable tensor factors $\mathbb{C}^a$ and $\mathbb{C}^b$, so that $ma = n\ell b$.
One picks pure states $\psi_a \in \mathbb{C}^a$ and $\psi_b \in \mathbb{C}^b$ and looks for a unitary
extension of the map $\phi \otimes \psi_a \mapsto (V\phi) \otimes \psi_b$. There are many ways to do this,
and this is a weakness of the ancilla approach in practical computations: one
is always forced to specify an initial state $\psi_a$ of the ancilla and many matrix
elements of the unitary interaction, which in the end drop out of all results.
As the uniqueness clause in the Stinespring theorem shows, it is the isometry
$V$ which neatly captures the relevant part of the ancilla picture.

In order to obtain the Kraus form of a general completely positive map $T$ from its
Stinespring representation, we choose vectors $\chi_x \in \mathbb{C}^\ell$ such that

$$\sum_x |\chi_x\rangle\langle\chi_x| = \mathbb{1} ,$$    (2.10)

and define Kraus operators for $T$ by $\langle\phi, K_x \psi\rangle = \langle\phi \otimes \chi_x, V\psi\rangle$ (we leave
the straightforward verification of (2.8) to the reader). Of course, we can take
the $\chi_x$ as an orthonormal basis of $\mathbb{C}^\ell$, but overcomplete systems of vectors
do just as well.

It turns out that all Kraus decompositions of a given completely posi- 
tive operator are obtained in the way just described. This follows from the 
following theorem, which solves the more general problem of finding all de- 
compositions of a given completely positive operator into completely positive 
summands. In terms of channels, this problem has the following interpretation:
for an instrument $\{T_x\}$, the sum $\bar{T} = \sum_x T_x$ describes the overall state
change, when the measurement results are ignored. So the reverse problem
is to find all measurements which are consistent with a given overall state
change (perturbation) of the system, or, in physical terms, all delayed-choice
measurements consistent with a given interaction between the system and its
environment. By analogy with results for states on abelian algebras (probability
measures) and states on C* algebras, we call the following theorem the
Radon-Nikodym theorem. For a proof see [90].

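Theorem 2.2. (Radon-Nikodym Theorem). Let $T_x : \mathcal{M}_n \to \mathcal{M}_m$, $x \in X$,
be a family of completely positive maps, and let $V : \mathbb{C}^m \to \mathbb{C}^n \otimes \mathbb{C}^\ell$ be
the Stinespring operator of $\bar{T} = \sum_x T_x$. Then there are uniquely determined
positive operators $F_x \in \mathcal{M}_\ell$ with $\sum_x F_x = \mathbb{1}$ such that

$$T_x(X) = V^* (X \otimes F_x) V .$$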

A simple but important special case is the case $\ell = 1$: then, since
$\mathbb{C}^n \otimes \mathbb{C} = \mathbb{C}^n$, we can just omit the tensor factor $\mathbb{C}^\ell$. The Stinespring form is
then exactly that of a single term in the Kraus form with Kraus operator 
K = V. The Radon-Nikodym part of the theorem then says that the only 
decompositions of T into completely positive summands are decompositions 
into positive multiples of T. Such maps are called “pure” . Since the identity 
and, more generally, symmetries are of this type, we obtain the following 
corollary: 

Corollary 2.1. ("No information without perturbation"). Let
$T : \mathcal{C}(X) \otimes \mathcal{M}_n \to \mathcal{M}_n$ be an instrument with a unitary global state change
$\bar{T}(A) = T(1 \otimes A) = U^* A U$. Then there is a probability distribution $p_x$ such that
$T_x = p_x \bar{T}$, and the probability $\rho[T_x(\mathbb{1})] = p_x$ for obtaining the measurement
result $x$ is independent of the input state $\rho$.

2.6.3 Duality between Channels and Bipartite States 

There are many connections between the properties of states on bipartite 
systems, and channels. For example, if Alice has locally created a state, and 
wants to send one half to Bob, the properties of the channel available for that 
transmission are crucial to the kind of distributed entangled state they can
create in this way. For example, if the channel is separable, the state will also
be separable.

Mathematically, the kind of relationship we shall describe here is very 
reminiscent of the relationship between bilinear forms and linear operators: 
an operator from an n-dimensional vector space to an m-dimensional vector 
space is parametrized by an n x m matrix, just like a bilinear form with 
arguments from an n-dimensional and an m-dimensional space. It is there- 
fore hardly surprising that the matrix elements of a density operator on a 
tensor product can be reorganized and reinterpreted as the matrix elements 
of an operator between operator spaces. What is perhaps not so obvious, 
however, is that the positivity conditions for states and for channels exactly 
match up in this correspondence. This is the content of the following Lemma, 
graphically represented in Fig. 2.9. 





Fig. 2.9. The duality scheme of Lemma 2.2; an arbitrary preparation P is uniquely 
represented as a preparation S of a pure state and the application of a channel T 
to half of the system 



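Lemma 2.2. Let $\rho$ be a density operator on $\mathcal{H} \otimes \mathcal{K}$. Then there is a Hilbert
space $\mathcal{H}'$, a pure state $\sigma$ on $\mathcal{H} \otimes \mathcal{H}'$ and a channel $T : \mathcal{B}(\mathcal{K}) \to \mathcal{B}(\mathcal{H}')$ such
that

$$\rho = \sigma \circ (I_{\mathcal{H}} \otimes T) .$$    (2.11)

Moreover, the restriction of $\sigma$ to $\mathcal{H}'$ can be chosen to be nonsingular, and in
this case the decomposition is unique in the sense that any other decomposition
$\rho = \sigma' \circ (I_{\mathcal{H}} \otimes T')$ is of the form $\sigma' = \sigma \circ R$ and $T' = R^{-1} T$, with a
unitarily implemented channel $R$.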

It is clear that $\sigma$ must be the purification of $\rho$, restricted to the first factor.
Thus we may set $\sigma = |\Psi\rangle\langle\Psi|$, where $\Psi = \sum_k \sqrt{r_k}\, e_k \otimes e'_k$; the $r_k > 0$ are the
nonzero eigenvalues of the restriction of $\rho$ to the first system, with eigenvectors
$e_k$, and $e'_k$ is a basis of $\mathcal{H}'$. Note that the $e'_k$ are indeed unique up to a unitary
transformation, so we only have to show that for one choice of $e'_k$ we obtain a unique $T$. From
the equation $\rho = \sigma \circ (I_{\mathcal{H}} \otimes T)$, we can then read off the matrix elements of
$T$:

$$\langle e'_k, T(|e_\mu\rangle\langle e_\nu|)\, e'_\ell\rangle = (r_k r_\ell)^{-1/2}\, \langle e_\ell \otimes e_\nu, \hat\rho\, (e_k \otimes e_\mu)\rangle .$$    (2.12)

We have to show that $T$, as defined by this equation, is completely positive
whenever $\rho$ is positive. For fixed coefficients $r_k$ the map $\rho \mapsto T$ is obviously
linear. Hence it suffices to prove complete positivity for $\hat\rho = |\varphi\rangle\langle\varphi|$. But in
that case $T(A) = V^* A V$, with $\langle e_\mu, V e'_k\rangle = r_k^{-1/2} \langle e_k \otimes e_\mu, \varphi\rangle$, so $T$ is indeed
completely positive. The normalization $T(\mathbb{1}) = \mathbb{1}$ follows from the choice of
$r_k$, and the lemma is proved.

The main use of this lemma is to translate results about entangled states 
to results about channels, and conversely. For this it is necessary to have 
a translation table of properties. Some entries are easy: for example, p is a 
product state iff $T$ is depolarizing in the sense that $T(A) = \mathrm{tr}(\rho_2 A)\mathbb{1}$ for some
density operator $\rho_2$, and $\rho$ is separable in the sense of Definition 2.1 iff $T$ is
separable (see (2.7)).
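
The first entry of this translation table can be checked on two qubits. The sketch assumes the maximally entangled vector $(e_0 \otimes e_0 + e_1 \otimes e_1)/\sqrt{2}$ as the pure state of the lemma and an arbitrary diagonal density operator `rho2`; the function names are hypothetical.

```python
# Sketch: rho(A (x) B) = sigma(A (x) T(B)) for a fixed pure state sigma.

def trace_prod(a, b):
    """tr(a b) for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def sigma(a, c):
    """<Psi, (a (x) c) Psi> for Psi = (e0 (x) e0 + e1 (x) e1) / sqrt(2)."""
    return 0.5 * sum(a[k][l] * c[k][l] for k in range(2) for l in range(2))

def depolarizing(rho2):
    """Heisenberg-picture channel T(B) = tr(rho2 B) * identity."""
    def channel(b):
        t = trace_prod(rho2, b)
        return [[t, 0.0], [0.0, t]]
    return channel

def identity_channel(b):
    return b

def bipartite_state(channel):
    """The state (A, B) -> sigma(A (x) T(B)) produced by sending one half."""
    return lambda a, b: sigma(a, channel(b))

Z = [[1.0, 0.0], [0.0, -1.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
rho2 = [[0.9, 0.0], [0.0, 0.1]]   # assumed output state of the channel

product = bipartite_state(depolarizing(rho2))
entangled = bipartite_state(identity_channel)

zz_product = product(Z, Z)        # factorizes: (tr Z / 2) * tr(rho2 Z) = 0
zz_entangled = entangled(Z, Z)    # perfect correlation
z_marginal = entangled(Z, I2)     # while the marginal expectation vanishes
```

With the depolarizing channel every expectation factorizes into the two marginals, so the resulting state is a product state; with the identity channel the correlations of the original pure state survive intact.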

2.6.4 Channel Capacity 

In the definition of channel capacity, we shall have to use a criterion for the 
approximation of one channel by another. Since channels are maps between 
normed spaces, one obvious choice would be to use the standard norm 

$$\|S - T\| := \sup\{\, \|S(A) - T(A)\| \;:\; \|A\| \le 1 \,\} .$$    (2.13)

However, as in the case of positivity, there is a problem with this definition
when one considers tensor products: the norms $\|T \otimes I_n\|$, where $I_n$ is the
identity on $\mathcal{M}_n$, may increase with $n$. This introduces complications when
one has to make estimates for parallel channels. Therefore we stabilize the
norm with respect to tensoring with "innocent bystanders", and introduce,
for any linear map $T$ between C* algebras, the norm

$$\|T\|_{\mathrm{cb}} := \sup_n \|T \otimes I_n\| ,$$    (2.14)

called the norm of complete boundedness, or "cb norm" for short. This name
derives from the observation that on infinite-dimensional C* algebras the
above supremum may be infinite even though each term in the supremum
is finite. By definition, a completely bounded map is one with $\|T\|_{\mathrm{cb}} < \infty$.
On a finite-dimensional C* algebra, every linear map is completely bounded:
for maps into $\mathcal{M}_d$ we have $\|T\|_{\mathrm{cb}} \le d\,\|T\|$. (As a general reference on these
matters, I recommend the book [91].) One might conclude from this that
the distinction between these norms is irrelevant. However, since we shall
need estimates for large tensor products, every factor that increases with
dimension can make a decisive difference. This is the reason for employing
the cb norm in the definition of channel capacity. It will turn out, however,
that in the most important cases one has only to estimate differences from
the identity, and $\|T - I\|$ and $\|T - I\|_{\mathrm{cb}}$ can be estimated in terms of each
other with dimension-independent bounds.

The basis of the notion of channel capacity is a comparison between the
given channel $T : \mathcal{A}_2 \to \mathcal{A}_1$ and an "ideal" channel $S : \mathcal{B}_2 \to \mathcal{B}_1$. The
comparison is effected by suitable encoding and decoding transformations
$E : \mathcal{A}_1 \to \mathcal{B}_1$ and $D : \mathcal{B}_2 \to \mathcal{A}_2$ so that the composed operator $ETD : \mathcal{B}_2 \to \mathcal{B}_1$
is a map which can be compared directly with the ideal channel $S$. Of
course, we are only interested in such a comparison in the case of optimal
encoding and decoding, i.e. in the quantity

$$\Delta(S,T) = \inf_{E,D} \|S - ETD\|_{\mathrm{cb}} ,$$    (2.15)

where the infimum is over all channels (i.e. all unit-preserving completely
positive maps) $E$ and $D$ with appropriate domain and range. Since these
data are at least implicitly given together with $S$ and $T$, there is no need
to specify them in the notation. $S$ should be thought of as representing one
word of the kind of message to be sent, whereas $T$ represents one invocation
of the channel. Channel capacity is defined as the number of $S$ words per
invocation of the channel $T$ which can be faithfully transmitted, with suitable
encoding and decoding for long messages. Here "messages of length $n$" are
represented by the tensor power $S^{\otimes n}$, and "$m$ invocations of the channel $T$"
are represented by the tensor power $T^{\otimes m}$.

Definition 2.3. Let $S$ and $T$ be channels. Then a number $c \ge 0$ is called
an achievable rate for $T$ with respect to $S$ if, for any sequences $n_\alpha$, $m_\alpha$ of
integers with $m_\alpha \to \infty$ and $\limsup_\alpha (n_\alpha/m_\alpha) < c$, we have

$$\lim_\alpha \Delta(S^{\otimes n_\alpha}, T^{\otimes m_\alpha}) = 0 .$$

The supremum of all achievable rates is called the capacity of $T$ with respect
to $S$, and is denoted by $C(S,T)$.

Note that by definition, 0 is an achievable rate (no integer sequences with
asymptotically negative ratio exist), and hence $C(S,T) \ge 0$. If all $c \ge 0$ are
achievable, then of course we write $C(S,T) = \infty$. It may be cumbersome to
check all pairs of integer sequences with a given upper ratio when testing
$c$. However, owing to the monotonicity of $\Delta$, it suffices to check only one
sequence, provided it is not too sparse: if there is any pair of sequences
$n_\alpha$, $m_\alpha$ satisfying the conditions in the definition (including $\Delta \to 0$) plus the
extra requirement that $(m_\alpha/m_{\alpha+1}) \to 1$, then $c$ is achievable.

The ideal channel for systems with an algebra of observables $\mathcal{A}$ is by
definition the identity map $I_{\mathcal{A}}$ on $\mathcal{A}$. For typographical convenience we shall
abbreviate $I_{\mathcal{A}}$ to “$\mathcal{A}$” whenever it appears as an argument of $\Delta$ or $C$. Using



2 Quantum Information Theory - an Invitation 






this notation, we shall now summarize the capacities of ideal quantum and 
classical channels. Of course, these are basic data for the whole theory: 

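$$C(\mathcal{M}_k, \mathcal{C}_n) = 0 \quad \text{for } k \ge 2 , \tag{2.16}$$

$$C(\mathcal{C}_k, \mathcal{C}_n) = C(\mathcal{M}_k, \mathcal{M}_n) = C(\mathcal{C}_k, \mathcal{M}_n) = \frac{\log n}{\log k} . \tag{2.17}$$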

Here the first equation is the capacity version of the no-teleportation theorem:
it is impossible to transport any quantum information on a classical channel.
The second equation shows that for capacity purposes, $\mathcal{M}_n$ is indeed best
compared with $\mathcal{C}_n$. In classical information theory one uses the one-bit system
$\mathcal{C}_2$ as the ideal reference channel. Similarly, we use the one-qubit channel as
the reference standard for quantum information, i.e. we define the classical
capacity $C_c(T)$ and the quantum capacity $C_q(T)$ of an arbitrary channel by

$$C_c(T) = C(\mathcal{C}_2, T) , \tag{2.18}$$

$$C_q(T) = C(\mathcal{M}_2, T) . \tag{2.19}$$
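
As a small numerical illustration of the right-hand side of (2.17) and of the
choice of units in (2.18) and (2.19), the following sketch evaluates
$\log n / \log k$. The function name `ideal_capacity` is a hypothetical helper
introduced here for illustration only, and the sketch deliberately does not cover
the case (2.16), where the capacity vanishes.

```python
import math

def ideal_capacity(k: int, n: int) -> float:
    """Right-hand side of (2.17): log n / log k.

    This is the number of k-level words carried per use of an ideal
    n-level channel. It does NOT apply to the case (2.16) of quantum
    words sent over a classical channel, where the capacity is 0.
    """
    if k < 2 or n < 1:
        raise ValueError("need k >= 2 and n >= 1")
    return math.log(n) / math.log(k)

# An ideal 8-level channel carries 3 bits per use (reference system k = 2).
bits_per_use = ideal_capacity(2, 8)
# Measured in three-level words instead, a 9-level channel carries 2 words per use.
trits_per_use = ideal_capacity(3, 9)
```

Changing the reference system only rescales the result by a fixed factor, which
is the sense in which the choice of $k = 2$ is merely a choice of units.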

Combining the results (2.17) with the “triangle inequality”, or two-step coding
inequality,

$$C(T_1, T_3) \ge C(T_1, T_2)\, C(T_2, T_3) , \tag{2.20}$$

we see that this is really only a choice of units, i.e. for arbitrary channels
$T$ we obtain $C(\mathcal{M}_n, T) = (\log 2/\log n)\, C(\mathcal{M}_2, T)$, and a similar equation for
classical capacities. Note that the term “qubit” refers to the reference system
$\mathcal{M}_2$, but it is not advisable to use “qubit” as a special unit for quantum
information (rather than just “bit”): this would be like distinguishing between
the units “vertical meter” and “horizontal meter” and would create problems
in every equation in which the two capacities were directly compared. The
simplest relation of this kind is

$$C_q(T) \le C_c(T) , \tag{2.21}$$

which follows from combining (2.20) with (2.17). Note that both definitions
apply to arbitrary channels $T$, whether the input and/or output are classical
or quantum or hybrids. In order for a channel to have a positive quantum
capacity, it is necessary that both the input and the output are quantum
systems. This is shown by combining (2.16) with the bottleneck inequality

$$C(S, T_1 T_2) \le \min\{ C(S, T_1),\, C(S, T_2) \} . \tag{2.22}$$

Another application of the bottleneck inequality is to separable channels.
These are by definition the channels with a purely classical intermediate
stage, i.e. $T = SR$, where “output of $S$” = “input of $R$” is a classical system.
For such channels $C_q(T) = 0$.

An important operation on channels is running two channels in parallel,
represented mathematically by the tensor product. The relevant inequality is

$$C(S, T_1 \otimes T_2) \ge C(S, T_1) + C(S, T_2) \tag{2.23}$$

for the standard ideal channels, and when all systems involved are classical, 
we even have equality. However, it is one of the big unsolved problems to 
decide under what general circumstances this is true. 

Comparison with the Classical Definition. Since the definition of classical
capacity $C_c(T)$ also applies to the purely classical situation, we have to
verify that it is indeed equivalent to the standard definition in this case. To
that end, we have to evaluate the error quantity $\|T - I\|_{\mathrm{cb}}$ for a
classical-to-classical channel. As noted, a classical channel
$T : \mathcal{C}(Y) \to \mathcal{C}(X)$ is given by a transition probability matrix $T(x \to y)$.
Since the cb norm coincides with the ordinary norm in the classical case, we obtain

$$\begin{aligned}
\|I - T\|_{\mathrm{cb}} &= \|I - T\| \\
&= \sup_{x,f} \Bigl| \sum_y \bigl( \delta_{xy} - T(x \to y) \bigr) f(y) \Bigr| \\
&= 2 \sup_x \bigl[ 1 - T(x \to x) \bigr] ,
\end{aligned}$$

where the supremum is over all $f \in \mathcal{C}(Y)$ with $|f(y)| \le 1$ and is attained
where $f$ is just the sign of the parenthesis in the second line, and we have
used the normalization of the transition probabilities. Hence, apart from an
irrelevant factor of two, $\|T - I\|_{\mathrm{cb}}$ is just the maximal probability of error,
i.e. the largest probability for sending $x$ and obtaining anything different.
This is precisely the quantity which is required to go to zero (after suitable
coding and decoding) in Shannon’s classical definition of the channel capacity
of discrete memoryless channels [92]. Hence the above definition agrees with
the classical one.
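
The identity between the supremum over test functions and twice the maximal
error probability can be checked numerically on a small example. The sketch
below is only an illustration: the function names and the transition matrix `T`
are made up for this purpose, and the brute-force search uses the fact that a
linear expression in $f$ attains its maximum over $|f(y)| \le 1$ at sign functions.

```python
from itertools import product

def error_norm(T):
    """Closed form 2 * max_x [1 - T(x -> x)] for a row-stochastic matrix T[x][y]."""
    return 2 * max(1 - T[x][x] for x in range(len(T)))

def error_norm_bruteforce(T):
    """sup over x and sign functions f of |sum_y (delta_xy - T[x][y]) f(y)|."""
    n = len(T)
    best = 0.0
    for x in range(n):
        for f in product((-1, 1), repeat=n):
            value = abs(sum(((1 if x == y else 0) - T[x][y]) * f[y]
                            for y in range(n)))
            best = max(best, value)
    return best

# Hypothetical three-letter channel; each row is a probability distribution.
T = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.0, 0.3, 0.7]]

closed_form = error_norm(T)
searched = error_norm_bruteforce(T)
```

Both numbers agree, and half of their common value is the largest probability
of receiving a letter different from the one that was sent.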

When considering the classical capacity $C_c(T)$ of a quantum channel,
it is natural to look at a coded channel $ETD$ as a channel in its own right.
Since we are considering transmission of classical information, this is a purely
classical channel, and we can look at its classical capacity. Optimizing over
coding and decoding, we obtain the quantity

$$C_{c,1}(T) = \sup_{ETD\ \text{classical}} C_c(ETD) . \tag{2.24}$$

This is called the one-shot classical capacity, because it can be said to involve
only one invocation of the channel $T$. Of course, many uses of the channel
are implicit in the capacity on the right-hand side, but these are in some
sense harmless. In fact, every coding and decoding scheme for comparing
$(ETD)^{\otimes n}$ with an ideal classical channel is also a coding/decoding for $T^{\otimes n}$,
but the codings/decodings that arise in this way from the coding $ETD$ are
only those in which the coded input states and the measurements at the
outputs are not entangled. If we allow entanglement over blocks of a large
length $\ell$, we thus recover the full classical capacity:

$$C_{c,1}(T) \le C_c(T) = \sup_\ell \frac{1}{\ell}\, C_{c,1}(T^{\otimes \ell}) . \tag{2.25}$$

It is not clear whether equality holds here. This is a fundamental question, 
which can be paraphrased as follows: “Does entangled coding ever help in 
sending classical information over quantum channels?” At the moment, all 
partial results known to the author seem to suggest that this is not the case. 

Comparison with other Error Criteria. Coming now to the quantum
capacity $C_q(T)$, we have to relate our definition to more current definitions.
One version, first stated by Bennett, is very similar to the one given above,
but differs slightly in the error quantity which is required to go to zero.
Rather than $\|T - I\|_{\mathrm{cb}}$, he considers the lowest fidelity of the channel,
defined as

$$\mathcal{F}(T) = \inf_\psi \langle \psi, T(|\psi\rangle\langle\psi|)\, \psi \rangle , \tag{2.26}$$

where the infimum is over all unit vectors. Hence the achievable rates are
those for which $\mathcal{F}(E\, T^{\otimes m_\alpha} D) \to 1$, where $E, D$ map to a system of
$n_\alpha$ qubits, and these integer sequences satisfy the same constraints as above.
This definition is equivalent to ours, because the error estimates are equivalent.
In fact, if we introduce the off-diagonal fidelity

$$\mathcal{F}_\%(T) = \sup_{\phi,\psi} \Re e\, \langle \phi, T(|\phi\rangle\langle\psi|)\, \psi \rangle \tag{2.27}$$

for any channel $T : \mathcal{M}_d \to \mathcal{M}_d$ with $d < \infty$, we have the following system of
estimates:

$$\|T - I\| \le \|T - I\|_{\mathrm{cb}} \le 4 \sqrt{1 - \mathcal{F}_\%(T)} \le 4 \sqrt{\|T - I\|} , \tag{2.28}$$

$$\|T - I\| \le 4 \sqrt{1 - \mathcal{F}(T)} \le 4 \sqrt{1 - \mathcal{F}_\%(T)} , \tag{2.29}$$

which will be proved elsewhere. The main point is, though, that the dimension
does not appear in these estimates, so if one such quantity goes to zero, all
others do, and we can build an equivalent definition of capacity out of any
one of them.

Yet another definition of quantum capacity has been given in terms of 
entropy quantities [93], and has also been shown to be equivalent [94]. 

2.6.5 Coding Theorems 

The definition of channel capacity looks simple enough, but computing it on 
the basis of this definition is in general a very hard task: it involves an opti- 
mization over all coding and decoding channels in systems of asymptotically 






Reinhard F. Werner 



many tensor factors. Hence it is crucial to obtain simpler expressions which 
can be computed in a much more direct way from the matrix elements of the 
given channel. Such results are called coding theorems, after the first theorem 
of this type, established by Shannon. 

In order to state this theorem, we need some entropy quantities. The von
Neumann entropy of a state with density matrix $\rho$ is defined as

$$S(\rho) = -\mathrm{tr}(\rho \log \rho) , \tag{2.30}$$

where the function of $\rho$ on the right-hand side is evaluated in the functional
calculus, and $0 \log 0$ is defined to be zero. The logarithm is chosen here as the
logarithm to base 2, so the unit of entropy is a “bit”. The relative entropy of
a state $\rho$ with respect to another, $\sigma$, is defined by

$$S(\rho, \sigma) = \mathrm{tr}[\rho (\log \rho - \log \sigma)] . \tag{2.31}$$

Both quantities are positive, and may be infinite on an infinite-dimensional
space. The von Neumann entropy is concave, whereas the relative entropy
is convex jointly in both arguments. For more precise definitions and many
further results, I recommend the book by Ohya and Petz [95].
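
For states that are diagonal in a common basis, both entropy quantities reduce
to sums over eigenvalues, which makes them easy to evaluate. The following
sketch assumes exactly this commuting case and takes the eigenvalue lists as
input; the function names are illustrative, and no attempt is made to
diagonalize general density matrices.

```python
import math

def entropy(eigs):
    """Von Neumann entropy (2.30) in bits, from the eigenvalues of rho.

    Terms with eigenvalue 0 are skipped, following the convention 0 log 0 = 0.
    """
    return -sum(p * math.log2(p) for p in eigs if p > 0)

def relative_entropy_commuting(p, q):
    """Relative entropy (2.31) in bits for two commuting states.

    p and q list the eigenvalues of rho and sigma in a shared eigenbasis.
    The result is infinite if rho has weight where sigma has none.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * (math.log2(pi) - math.log2(qi))
    return total

pure = [1.0, 0.0]    # spectrum of a pure qubit state
mixed = [0.5, 0.5]   # spectrum of the maximally mixed qubit state

s_pure = entropy(pure)
s_mixed = entropy(mixed)
rel_pure_mixed = relative_entropy_commuting(pure, mixed)
rel_mixed_pure = relative_entropy_commuting(mixed, pure)
```

In this example the pure state has entropy 0 and the maximally mixed qubit
state has entropy 1 bit, while the relative entropy is finite in one direction
and infinite in the other, which shows that it is not symmetric in its arguments.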

The strongest coding theorem for quantum channels known so far is the
following expression for the one-shot classical capacity, proved by Holevo [96]:

$$C_{c,1}(T) = \max \Bigl[ S\Bigl( \sum_i p_i\, T_*[\rho_i] \Bigr) - \sum_i p_i\, S\bigl( T_*[\rho_i] \bigr) \Bigr] . \tag{2.32}$$



Whether or not this is equal to the classical capacity depends on whether the 
conjectured equality in (2.25) holds or not. In any case, equality is known 
to hold for channels with classical input, so Holevo’s coding theorem is a 
genuine extension of Shannon’s. 
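
To see what the bracket in (2.32) looks like in a concrete case, one can
evaluate it for a single fixed ensemble. The sketch below makes several
assumptions that are not taken from the text: the channel is a qubit
depolarizing channel $T_*[\rho] = (1 - \epsilon)\rho + \epsilon I/2$, the ensemble consists of the
two basis states with equal weights, and the parameter name `noise` stands for
$\epsilon$. Since no maximization is carried out, the result is only a lower bound
on the maximum in (2.32).

```python
import math

def h2(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def holevo_bracket_depolarizing(noise):
    """Bracket of (2.32) for inputs |0>, |1> with weights 1/2.

    Assumed channel: T_*[rho] = (1 - noise) * rho + noise * I/2.
    """
    if not 0 <= noise <= 1:
        raise ValueError("noise must lie in [0, 1]")
    # Each output state is diagonal with eigenvalues (1 - noise/2, noise/2).
    output_entropy = h2(noise / 2)
    # The average output state is I/2, whose entropy is one bit.
    average_entropy = 1.0
    return average_entropy - output_entropy

noiseless = holevo_bracket_depolarizing(0.0)
fully_noisy = holevo_bracket_depolarizing(1.0)
halfway = holevo_bracket_depolarizing(0.5)
```

Without noise the expression equals one bit, and it drops to zero when the
output no longer depends on the input.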

No coding theorem has been proved yet for the quantum capacity. However,
there is a fairly good candidate for the right-hand side, related to a
quantity called “coherent information” [97]. The formula is written most
compactly by relating it to an entanglement quantity via Lemma 2.2. For
any bipartite state $\rho$ with restriction $\rho^B$ to the second factor, let

$$E_S(\rho) = S(\rho^B) - S(\rho) . \tag{2.33}$$

This is a measure of entanglement of sorts, because it is large when $S(\rho)$
is small, e.g. when $\rho$ is pure, and $\rho^B$ is very mixed when, for example, $\rho$ is
maximally entangled. It can be negative, though (see [98] for a discussion).
We set

$$C_{S,1}(T) = \sup_\sigma E_S[\sigma \circ (I \otimes T)] , \tag{2.34}$$






where the supremum is over all bipartite pure states $\sigma$. Note that any measure
of entanglement can be turned into a capacity-like expression by this procedure.
Since this quantity is known not to be additive [99], the candidate for
the right-hand side of the quantum coding theorem is

$$C_S(T) = \sup_\ell \frac{1}{\ell}\, C_{S,1}(T^{\otimes \ell}) , \tag{2.35}$$

in analogy to (2.25). So far there have been some good heuristic arguments
[100, 101] in favor of this candidate, but a full proof remains one of the main
challenges in the field.

An interesting upper bound on $C_q(T)$ can be written in terms of the
transpose operation $\Theta$ on the output system [81]: we have

$$C_q(T) \le \log_2 \|\Theta T\|_{\mathrm{cb}} . \tag{2.36}$$

Hence, if $\Theta T$ happens to be completely positive (as for any channel with an
intermediate classical state), this map is a channel; hence, it has a cb norm
of 1, and $C_q(T) = 0$. This criterion can also be used to show that whenever
there is sufficiently high noise in a channel, it will have a quantum capacity
of zero.

2.6.6 Teleportation and Dense Coding Schemes

In this section we shall show that entanglement-assisted teleportation and
dense coding, as described in Sects. 2.4.3 and 2.4.4, really work.

Rather than going through the now standard derivations in the basic examples
involving qubits, we shall use the structure assembled so far to reverse
the question, i.e. we try to find the most general setup in which teleportation
and dense coding work without errors. This will give some additional
insights, and possibly some welcome flexibility when it comes to realizing
these processes for systems larger than a qubit. The task as stated in this
form is somewhat beyond the scope of this chapter, mainly because there are
so many ways to waste resources, which do not necessarily have a compact
characterization. So, in order to obtain a readable result, we look only at the
“tight case” [102], in which resources are used, in a sense, optimally. By this
we mean that all Hilbert spaces involved have the same finite but arbitrary
dimension $d$ (so we can take them all equal to $\mathcal{H} = \mathbb{C}^d$), and the classical
channel distinguishes exactly $|X| = d^2$ signals.

For both teleportation and dense coding, the beginning of each transmission
is the distribution of the parts of an entangled state $\omega$ between the
sender Alice and the receiver Bob. Only then is Alice given the message she
is supposed to send, which is a quantum state in the case of teleportation
and a classical value in the case of dense coding. She codes this in a suitable
way, and Bob reconstructs the original message by evaluating Alice’s signal
jointly with his entangled subsystem.
For dense coding, assume that $x \in X$ is the message given to Alice.
She encodes it by transforming her entangled system with a channel $T_x$ and
sending the resulting quantum system to Bob, who measures an observable
$F$ jointly on Alice’s particle and his. The probability for obtaining $y$ as a
result is then $\mathrm{tr}[\omega\, (T_x \otimes I)(F_y)]$, where the “$\otimes I$” expresses the fact that no
transformation is applied to Bob’s particle, while Alice applies $T_x$ to hers. If
everything works correctly, this expression has to be equal to 1 for $x = y$,
and 0 otherwise:

$$\mathrm{tr}[\omega\, (T_x \otimes I)(F_y)] = \delta_{xy} . \tag{2.37}$$

Let us take a similar look at teleportation. Here three quantum systems are
involved: the entangled pair in state $\omega$, and the input system given to Alice,
in state $\rho$. Thus the overall initial state is $\rho \otimes \omega$. Alice measures an observable
$F$ on the first two factors, obtaining a result $x$, which is sent to Bob. Bob
applies a transformation $T_x$ to his particle, and makes a final measurement
of an observable $A$ of his choice. Thus the probability of Alice measuring $x$
and of Bob obtaining a result “yes” on $A$ is $\mathrm{tr}(\rho \otimes \omega)[F_x \otimes T_x(A)]$. Note that
the tensor symbols in this equation refer to different splittings of the system
($1 \otimes 23$ and $12 \otimes 3$, respectively). Teleportation is successful if the overall
probability of obtaining $A$, computed by summing over all possibilities $x$, is
the same as for an ideal channel, i.e.

$$\sum_{x \in X} \mathrm{tr}(\rho \otimes \omega)[F_x \otimes T_x(A)] = \mathrm{tr}(\rho A) . \tag{2.38}$$

Surprisingly, in the tight case one obtains exactly the same conditions on
$\omega, T_x, F_x$ for teleportation and for dense coding, i.e. a dense-coding scheme
can be turned into a teleportation scheme simply by letting Bob and Alice
swap their equipment. However, this symmetry depends crucially on the
tightness condition, because teleportation schemes with $|X| > d^2$ signals are
trivial to achieve, but $|X| > d^2$ is impossible for dense coding. Conversely,
dense coding through a $d' > d$-dimensional channel is trivial to achieve, while
teleportation of states with $d' > d$ dimensions (with the same $X$) is impossible.

Let us now give a heuristic sketch of the arguments leading to the necessary
and sufficient conditions for (2.37) and (2.38) to hold. For full proofs we
refer to [102]. A crucial ingredient in the analysis of the teleportation equation
is the “no measurement without perturbation” principle from Lemma 2.1: the
left-hand side of (2.38) is indeed such a decomposition, so each term must be
equal to $\lambda_x \mathrm{tr}(\rho A)$ for all $\rho, A$. But we can carry this even further: suppose
we decompose $\omega$, $F_x$ or $T_x$ into a sum of (completely) positive terms. Then
each term in the resulting sum must also be proportional to $\mathrm{tr}(\rho A)$. Hence
any components of $\omega$, $T_x$ or $F_x$ satisfy a teleportation equation as well (up
to normalization). Similarly, the vanishing of the dense-coding equation for
$x \ne y$ carries over to every positive summand in $\omega$, $T_x$ or $F_x$. Hence it is
plausible that we must first analyze the case where all $\omega, F_x, T_x$ are “pure”,
i.e. have no nontrivial decompositions as sums of (completely) positive terms:

$$\omega = |\Omega\rangle\langle\Omega| , \tag{2.39}$$

$$F_x = |\Phi_x\rangle\langle\Phi_x| , \tag{2.40}$$

$$T_x(A) = U_x^* A U_x . \tag{2.41}$$

The further analysis will show that in the pure case any two of these elements
determine the third via the teleportation or the dense-coding equation, so
that in fact all components of $\omega$ (and correspondingly $T_x$ or $F_x$) have to be
proportional. Hence each of these has to be pure in the first place. For the
present discussion, let us just assume purity in the form (2.39)-(2.41) from
now on. Note that normalization requires that each $U_x$ is unitary.

The second normalization condition, $\sum_x |\Phi_x\rangle\langle\Phi_x| = \sum_x F_x = I$, has
an interesting consequence in conjunction with the tightness condition: the
vectors $\Phi_x$ live in a $d^2$-dimensional space, and there are exactly $d^2$ of them.
This implies that they are orthogonal: since each vector $\Phi_x$ satisfies $\|\Phi_x\| \le 1$,
and $d^2 = \mathrm{tr}(I) = \sum_x \|\Phi_x\|^2$, we must have $\|\Phi_x\| = 1$ for all $x$. Hence, in the
sum $1 = \sum_x \langle \Phi_y, F_x \Phi_y \rangle$ the term $y = x$ is equal to 1, and hence the others
must be zero.

Now consider the term with index $x$ in the teleportation equation and set
$\rho = |\phi'\rangle\langle\phi|$ and $A = |\psi'\rangle\langle\psi|$. Then the trace splits into two scalar products,
in which the variables $\phi, \psi'$ and $\phi', \psi$ can be chosen independently, which leads
to an equation of the form

$$\langle \phi \otimes \Omega,\ \Phi_x \otimes (U_x^* \psi') \rangle = \lambda_x \langle \phi, \psi' \rangle , \tag{2.42}$$

for all $\phi, \psi'$, and to coefficients $\lambda_x$ which must satisfy $\sum_x |\lambda_x|^2 = 1$. Note how
in this equation a scalar product between the vectors in the first and third
tensor factors is generated. This type of equation, which is clearly the core
of the teleportation process, may be solved in general:

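Lemma 2.3. Let $\mathcal{H}, \mathcal{K}$ be finite-dimensional Hilbert spaces, and let
$\Omega_1 \in \mathcal{K} \otimes \mathcal{H}$ and $\Omega_2 \in \mathcal{H} \otimes \mathcal{K}$ be unit vectors such that, for all
$\phi, \psi \in \mathcal{H}$,

$$\langle \phi \otimes \Omega_1,\ \Omega_2 \otimes \psi \rangle = \lambda \langle \phi, \psi \rangle . \tag{2.43}$$

Then $|\lambda| \le 1/\dim \mathcal{H}$, with equality iff $\Omega_1$ and $\Omega_2$ are maximally entangled
and equal up to the exchange of the tensor factors $\mathcal{H}$ and $\mathcal{K}$.

For the proof, consider the Schmidt decomposition $\Omega_1 = \sum_k \sqrt{w_k}\, f_k \otimes e_k$,
and insert $\phi = e_n$, $\psi = e_m$ into (2.43) to find the matrix elements of $\Omega_2$:

$$\langle e_n \otimes f_m,\ \Omega_2 \rangle = \lambda\, w_m^{-1/2}\, \delta_{nm} .$$

Clearly, $\|\Omega_2\|^2 = |\lambda|^2 \sum_m w_m^{-1}$ takes its smallest value under the
constraint $\sum_m w_m = \|\Omega_1\|^2 = 1$ only at the point where all $w_m$ are equal.
This proves the lemma.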
We apply this lemma to $\Omega_1 = (I \otimes U_x)\Omega$ and $\Omega_2 = \Phi_x$. Then
$\sum_x |\lambda_x|^2 \le d^2 d^{-2} = 1$, with equality only if all the vectors involved are
maximally entangled and are pairwise equal up to an exchange of factors:

$$\Phi_x = (U_x \otimes I)\Omega , \tag{2.44}$$

where we take $\Omega = d^{-1/2} \sum_k e_k \otimes e_k$ by an appropriate choice of bases. If
$\Omega$ is maximally entangled, (2.44) sets up a one-to-one correspondence between
unitary operators $U_x$ and the vectors $\Phi_x$, as independent elements in
the construction. The $\Phi_x$ have to be an orthonormal basis of maximally
entangled vectors, and there are no further constraints. In terms of the $U_x$,
the orthogonality of the $\Phi_x$ translates into orthogonality with respect to the
Hilbert-Schmidt scalar product:

$$\mathrm{tr}(U_x^* U_y) = d\, \delta_{xy} . \tag{2.45}$$

Again, there are no further constraints, so any collection of $d^2$ unitaries
satisfying these equations leads to a teleportation scheme.

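For the dense-coding case we obtain the same result, although by a different
route. Equation (2.44) follows easily if we write the dense-coding equation (2.37)
as $|\langle \Omega, (U_x^* \otimes I)\Phi_y \rangle|^2 = \delta_{xy}$. The problem is to show that $\Omega$ has to be
maximally entangled. Using the reduced density operator $\omega_1$ of $\omega$, this becomes

$$\mathrm{tr}(\omega_1 U_x^* U_y) = \langle \Omega, (U_x^* U_y \otimes I)\Omega \rangle = \langle \Phi_x, \Phi_y \rangle = \delta_{xy} . \tag{2.46}$$

We claim that this equation, for a positive operator $\omega_1$ and unitaries $U_x$,
implies that $\omega_1 = d^{-1} I$. To see this, expand the operator $A = |\phi\rangle\langle e_k| \omega_1^{-1}$ in
the basis $U_x$ according to the formula $A = \sum_x U_x\, \mathrm{tr}(U_x^* A \omega_1)$:

$$\sum_x \langle e_k, U_x^* \phi \rangle\, U_x = |\phi\rangle\langle e_k| \omega_1^{-1} .$$

Taking the matrix element $\langle \phi | \cdot | e_k \rangle$ of this equation and summing over $k$, we
find

$$\sum_{x,k} \langle e_k, U_x^* \phi \rangle \langle \phi, U_x e_k \rangle = \sum_x \mathrm{tr}(U_x^* |\phi\rangle\langle\phi| U_x) = d^2 \|\phi\|^2 = \|\phi\|^2\, \mathrm{tr}(\omega_1^{-1}) .$$

Hence $\mathrm{tr}(\omega_1^{-1}) = d^2 = \sum_k r_k^{-1}$, where $r_k$ are the eigenvalues of $\omega_1$. Using
again the fact that the smallest value of this sum under the constraint
$\sum_k r_k = 1$ is attained only for constant $r_k$, we find $\omega_1 = d^{-1} I$, and $\Omega$ is
indeed maximally entangled.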

To summarize, we have the following theorem (again, for a detailed proof 
see [102]): 

Theorem 2.3. Given either a teleportation scheme or a dense-coding scheme,
which is tight in the sense that all Hilbert spaces are $d$-dimensional and
$|X| = d^2$ classical signals are distinguished, then

• $\omega = |\Omega\rangle\langle\Omega|$ is pure and maximally entangled,

• $F_x = |\Phi_x\rangle\langle\Phi_x|$, where the $\Phi_x$ form an orthonormal basis of maximally
entangled vectors,

• $T_x(A) = U_x^* A U_x$, where the $U_x$ are unitary and orthonormal in the sense
that $\mathrm{tr}(U_x^* U_y) = d\, \delta_{xy}$, and

• these objects are connected by the equation $\Phi_x = (U_x \otimes I)\Omega$.

Given either the $\Phi_x$ or the $U_x$ with the appropriate orthogonality properties,
and a maximally entangled vector $\Omega$, the above conditions determine a dense
coding and a teleportation scheme.
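
For $d = 2$ the conditions of the theorem can be checked directly. The sketch
below takes the identity and the three Pauli matrices as the unitaries $U_x$ and
$\Omega = (|00\rangle + |11\rangle)/\sqrt{2}$, builds $\Phi_x = (U_x \otimes I)\Omega$, and evaluates the
left-hand side of the dense-coding equation (2.37) in the pure case, using
$\langle \Omega, (U_x^* \otimes I)\Phi_y \rangle = \langle \Phi_x, \Phi_y \rangle$. All names in the code are illustrative
choices, not notation from the text.

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
UNITARIES = [I2, SX, SY, SZ]

# Omega = (|00> + |11>)/sqrt(2); components ordered as 00, 01, 10, 11.
OMEGA = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]

def apply_on_first(U, v):
    """Return (U tensor I) v for a two-qubit vector v."""
    out = [0j] * 4
    for i in range(2):
        for j in range(2):
            for k in range(2):
                out[2 * i + k] += U[i][j] * v[2 * j + k]
    return out

def inner(a, b):
    """Scalar product, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

# Phi_x = (U_x tensor I) Omega, as in the last item of the theorem.
PHI = [apply_on_first(U, OMEGA) for U in UNITARIES]

def dense_coding_probability(x, y):
    """|<Phi_x, Phi_y>|^2, the pure-case form of the left-hand side of (2.37)."""
    return abs(inner(PHI[x], PHI[y])) ** 2

table = [[round(dense_coding_probability(x, y), 12) for y in range(4)]
         for x in range(4)]
```

The resulting table is the $4 \times 4$ identity matrix, so Bob identifies each of
the four messages with certainty, and the vectors $\Phi_x$ are, up to phases, the
four Bell states.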

In particular, we have shown that a teleportation scheme becomes a dense- 
coding scheme and vice versa, when Alice and Bob swap their equipment. 
However, this is only true in the tight case: for a larger quantum channel, 
dense coding becomes easier but teleportation becomes more demanding. 
Similarly, teleportation becomes easier with more allowed classical information
exchange, whereas dense coding of more than $d^2$ signals is impossible.

In order to construct a scheme, it is best to start from the equation
$\mathrm{tr}(U_x^* U_y) = d\, \delta_{xy}$, i.e. to look for orthonormal bases in the space of operators
consisting of unitaries. For $d = 2$ the solution is essentially unique: $U_1, \ldots, U_4$
are the identity and the three Pauli matrices, which leads to the standard
examples. Group theory helps to construct examples of such bases for any
dimension $d$, but this construction by no means exhausts the possibilities.
A fairly general construction is given in [102]. It requires two combinatorial
structures known from classical design theory [103, 104]: a Latin square of
order $d$, i.e. a matrix in which each row and column is a permutation of
$(1, \ldots, d)$, and $d$ Hadamard matrices, i.e. unitary $d \times d$ matrices, in which each
entry has modulus $d^{-1/2}$. For neither Latin squares nor Hadamard matrices
does an exhaustive construction exist, so these are rich fields for hunting
and gathering new examples, or even infinite families of examples. Certainly,
this connection suggests that a full classification or exhaustive construction
of teleportation and dense-coding schemes cannot be expected. However, it
may still be a good project to look for schemes with additional desirable
features.
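
One well-known group-theoretical family of unitary operator bases consists of
products of a cyclic shift and a diagonal phase ("clock") matrix. The sketch
below builds these $d^2$ operators and checks the orthogonality relation
$\mathrm{tr}(U_x^* U_y) = d\, \delta_{xy}$ numerically. That this particular family is the
construction the text alludes to is an assumption made here, and it is not the
Latin-square and Hadamard construction of [102]; the function names are
illustrative.

```python
import cmath

def weyl_operators(d):
    """Return the d*d matrices U_{a,b} = X^a Z^b as nested lists.

    X is the cyclic shift e_k -> e_{(k+1) mod d} and Z multiplies e_k by
    omega**k with omega = exp(2 pi i / d). Each U_{a,b} is a permutation
    matrix times a diagonal phase matrix, hence unitary.
    """
    omega = cmath.exp(2j * cmath.pi / d)
    ops = []
    for a in range(d):
        for b in range(d):
            # (X^a Z^b) e_k = omega**(b*k) e_{(k+a) mod d}
            U = [[omega ** (b * k) if j == (k + a) % d else 0j
                  for k in range(d)]
                 for j in range(d)]
            ops.append(U)
    return ops

def hs_product(U, V):
    """Hilbert-Schmidt scalar product tr(U^* V)."""
    d = len(U)
    return sum(U[j][k].conjugate() * V[j][k]
               for j in range(d) for k in range(d))

def is_unitary_basis(ops, tol=1e-9):
    """Check tr(U_x^* U_y) = d * delta_xy for all pairs of operators."""
    d = len(ops[0])
    return all(abs(hs_product(U, V) - (d if x == y else 0)) < tol
               for x, U in enumerate(ops)
               for y, V in enumerate(ops))

checks = {d: is_unitary_basis(weyl_operators(d)) for d in (2, 3, 4, 5)}
```

For $d = 2$ this family reproduces the identity and the Pauli matrices up to
phase factors, and the check passes for every dimension tried here.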
Quantum entanglement lies at the heart of the new field of quantum
communication and computation. For a long time, entanglement was seen just as
one of those fancy features which make quantum mechanics so counterintu- 
itive. But recently, quantum information theory has shown the tremendous 
importance of quantum correlations for the formulation of new methods of 
information transfer and for algorithms exploiting the capabilities of quan- 
tum computers. While the latter needs entanglement between a large number 
of quantum systems, the basic quantum communication schemes rely only on 
entanglement between the members of a pair of particles, directly pointing 
to a possible realization of such schemes by means of correlated photon pairs 
such as those produced by parametric down-conversion. 

This chapter describes the first experimental realizations of quantum com- 
munication schemes using entangled photon pairs. We show how to make com- 
munication secure against eavesdropping using entanglement-based quantum 
cryptography, how to increase the information capacity of a quantum chan- 
nel by quantum dense coding and, finally, how to communicate quantum 
information itself in the process of quantum teleportation. 

3.1 Introduction 

Quantum mechanics is probably the most successful physical theory of this 
century. It provides powerful tools which form one of the cornerstones of 
scientific progress, and which are indispensable for the understanding of om- 
nipresent technical devices such as the transistor, semiconductor chips and 
the laser. The most important areas where those devices are used are mod- 
ern communication and information-processing technologies. But quantum 
mechanics, until now, has only been used to construct these devices - quan- 
tum effects are absolutely avoided in the representation and manipulation 
of information. Rather than using single photons, one still uses strong light 
pulses to send information along optical high-speed connections, and one re- 
lies on electrical currents in semiconductor logic chips instead of applying 
single electrons as signal carriers. 

This caution surely is due to the fact that, at first glance, the inherent 
stochastic character of quantum effects seems only to introduce unavoidable 
noise and thus does not really recommend their use. Yet quantum information
theory shows us, in more and more examples, how one can profit from
the peculiar properties of quantum systems, and, when applied correctly, 
how fundamental quantum effects can add to the power and features of clas- 
sical information processing and transmission [12, 105, 106]. For example, 
quantum computers will outperform conventional computers, and quantum 
cryptography enables, for the first time, secure communication. While quan- 
tum cryptography, in principle, can be performed even with single quantum 
particles, all the other proposals utilize entanglement between two or more 
particles, for example to enhance communication rates or to enable the tele- 
portation of quantum states. 

Entanglement between quantum systems is a pure quantum effect. It is 
closely related to the superposition principle and describes correlations be- 
tween quantum systems that are much stronger and richer than any classical 
correlation could be. Originally this property was introduced by Einstein, 
Podolsky and Rosen (EPR) [24], and also by Schrödinger [5] and Bohr [107]
in the discussion of the completeness of quantum mechanics and by von Neu- 
mann [108] in his description of the measurement process. Entanglement also 
provides a handle to distinguish various interpretations of quantum mechan- 
ics via Bell’s theorem [72, 109] or the GHZ argument [37]. The development 
of experimental techniques has enabled researchers to perform the recent 
long-distance tests of entanglement [30], the first Bell experiment fulfilling 
Einstein locality conditions [31] and the first GHZ experiment [38], which 
all provided convincing demonstrations of the validity of standard quantum 
mechanics.¹

The field of quantum information is not concerned with the fundamental 
issues. Instead, it builds on the validity of quantum mechanics and applies the 
characteristic features of entangled systems to devise new, powerful schemes 
for communication and computation. Entanglement between a large number 
of quantum systems will enable very efficient computations. In particular, 
the factorization algorithm of Shor [9] and the search algorithm of Grover 
[58] (together with the increasing number of algorithms derived from one or 
the other) show how entanglement and the associated interference between 
entangled states can boost the power of quantum computers. 

Quantum communication exploits entanglement between only two or three 
particles. As will be seen in the following sections, the often counterintu- 
itive features of such small entangled systems enable powerful communication 
methods. After the very basic properties of pairs of entangled particles have 
been introduced (Sect. 3.2), Sect. 3.3 gives an overview of the possibilities 
of three important quantum communication schemes: entanglement-based 
quantum cryptography enables secret key exchange and thus truly secure 
communication [16]; using quantum dense coding, one can send classical
information more efficiently [51]; and, finally, with quantum teleportation one
can transfer quantum information, that is, the quantum state itself, from
one quantum system to another [52]. The tools for the experimental realization
of those quantum communication schemes are presented in Sect. 3.4.
In particular, we show how to produce polarization-entangled photon pairs
by parametric down-conversion [111] and how to observe these nonclassical
states by interferometric Bell-state analysis [112]. In Sect. 3.5 we describe the
first experimental realizations of basic quantum communication schemes. In
experiments performed during recent years at the University of Innsbruck,
we realized entanglement-based quantum cryptography with randomly
switched analyzers and with the two users separated by more than 400 m
[113]; we demonstrated the possibility of transmitting 1.58 bits of classical
information by encoding trits on a single two-state photon [114]; and we
transferred a qubit, in our case the polarization state, from one photon to
another by quantum teleportation [10, 11] and entanglement swapping [115].

¹ We are aware of the detection loophole [110], which will be closed whenever
technology allows.

3.2 Entanglement — Basic Features 

For a long time, entanglement was seen merely as one of the counterintuitive 
features of quantum mechanics, important only within the realm of the EPR 
paradox. Only lately has the field of quantum information exploited these 
features to obtain new types of information transmission and processing. 
Recent literature now offers a thorough discussion of all the various properties 
of entangled systems [37, 72, 116] (see also Chap. 5); in this review, we 
concentrate on those features which form the foundation of the basic quantum 
communication schemes. 

At the heart of entanglement lies another fundamental feature of quantum 
mechanics, the superposition principle. If we look at a classical, two-valued 
system, for example a coin, we find it in either one of its two possible states, 
that is, either head or tail. Its quantum mechanical counterpart, a two-state 
quantum system, however, can be found in any superposition of two possible 
basis states, e.g. $|\Psi\rangle = (1/\sqrt{2})(|0\rangle + |1\rangle)$. Here we denote the two orthogonal
basis states by $|0\rangle$ and $|1\rangle$, respectively.² This generic notation can stand
for any of the properties of various two-state systems, for example for the
ground state $|g\rangle$ and excited state $|e\rangle$ of an atom, or, as is the case in our
experiments, for the horizontal polarization $|H\rangle$ and vertical polarization $|V\rangle$
of a photon.

In the classical world, we find two coins to be in the states of either 
head/head, head/tail, tail/head or tail/tail, and we can identify these four 
possibilities with the four quantum states 

$|0\rangle_1 |0\rangle_2$, $|0\rangle_1 |1\rangle_2$, $|1\rangle_1 |0\rangle_2$ and $|1\rangle_1 |1\rangle_2$ ,

describing two two-state quantum systems. As the superposition principle
holds for more than one quantum system, the two quantum particles are no
longer restricted to the four “classical” basis states above, but can be in any
superposition thereof, for example in the entangled state

$$|\Psi\rangle = \frac{1}{\sqrt{2}} \bigl( |0\rangle_1 |0\rangle_2 + |1\rangle_1 |1\rangle_2 \bigr) . \tag{3.1}$$

² This notation should not be confused with the description of an electromagnetic
field (vacuum or single-photon state) in second quantization. Here we use only
the notions of first quantization to describe the properties of two-state systems.

Of course, one is restricted neither to only two particles nor to such
maximally entangled states. During the last decade, enormous progress was
achieved in the theoretical studies of the quantum features of multiparticle
systems. One will observe even more stunning correlations between three
or more entangled particles [37, 117]; one can generalize to the observation
of interference and entanglement between multistate particles [118] and to
entanglement for mixed states. There also exists the possibility to purify
entanglement [119], and one can even find two-particle systems which are not
actually entangled, but are such that a local observer cannot distinguish them
from entangled states [120]. For the basic quantum communication schemes
and experiments, we can concentrate on the particular properties of maximally
entangled two-particle systems. Considering two two-state particles,
we find a basis of four orthogonal, maximally entangled states, the so-called
Bell-states basis:

$$|\Psi^+\rangle_{12} = \frac{1}{\sqrt{2}}\left(|0\rangle_1|1\rangle_2 + |1\rangle_1|0\rangle_2\right) , \tag{3.2}$$

$$|\Psi^-\rangle_{12} = \frac{1}{\sqrt{2}}\left(|0\rangle_1|1\rangle_2 - |1\rangle_1|0\rangle_2\right) , \tag{3.3}$$

$$|\Phi^+\rangle_{12} = \frac{1}{\sqrt{2}}\left(|0\rangle_1|0\rangle_2 + |1\rangle_1|1\rangle_2\right) , \tag{3.4}$$

$$|\Phi^-\rangle_{12} = \frac{1}{\sqrt{2}}\left(|0\rangle_1|0\rangle_2 - |1\rangle_1|1\rangle_2\right) . \tag{3.5}$$



The name “Bell states” was given to these states since they maximally 
violate a Bell inequality [121]. This inequality was deduced in the context of 
so-called local realistic theories (see Chap. 1), and gives a range of possible 
results for certain statistical tests on identically prepared pairs of particles 
[109]. Quantum mechanics predicts different results if the measurements are 
performed on entangled pairs. If the two particles are not correlated, i.e. are 
described by a product state, the quantum mechanical prediction is within 
the range given by Bell’s inequality. 

The remarkably nonclassical features of entangled pairs arise from the 
fact that the two systems can no longer be seen as being independent but 
now have to be seen as one combined system, where the observation of one of 
the two will change the possible predictions of measurement results obtained 
for the other [5, 107]. Formally, this mutual dependence is reflected by the
fact that the entangled state can no longer be factored into a product of two
states for the two subsystems separately.
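
The orthogonality of the four Bell states (3.2)-(3.5) and the fact that none of
them factors into a product can be checked with a few lines of arithmetic. The
following is only a minimal sketch: the amplitude ordering |00>, |01>, |10>,
|11> and the helper names `inner` and `is_product` are our own choices for
illustration. The product test uses the fact that a two-particle state with
amplitudes c00, c01, c10, c11 factors exactly when c00*c11 = c01*c10.

```python
import math

S = 1 / math.sqrt(2)

# Two-particle amplitudes in the (assumed) order |00>, |01>, |10>, |11>;
# the first label refers to particle 1, the second to particle 2.
BELL = {
    "Psi+": [0, S, S, 0],    # Eq. (3.2)
    "Psi-": [0, S, -S, 0],   # Eq. (3.3)
    "Phi+": [S, 0, 0, S],    # Eq. (3.4)
    "Phi-": [S, 0, 0, -S],   # Eq. (3.5)
}


def inner(u, v):
    """Inner product <u|v> of two amplitude lists."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def is_product(state, tol=1e-12):
    """True if the two-particle state factors into single-particle states."""
    c00, c01, c10, c11 = state
    return abs(c00 * c11 - c01 * c10) < tol


# Gram matrix of the Bell basis: should be the 4x4 identity.
gram = {(m, n): inner(BELL[m], BELL[n]) for m in BELL for n in BELL}

# None of the Bell states factors, whereas |0>|0> trivially does.
factorizable = {name: is_product(vec) for name, vec in BELL.items()}
```

For comparison, `is_product([1, 0, 0, 0])` is true for the product state
|0>|0>, while every entry of `factorizable` is false.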

If one looks only at one of the two particles, one finds it with equal probability
in state $|0\rangle$ or in state $|1\rangle$. One has no information about the particular
outcome of a measurement to be performed. However, the observation of
one of the two particles determines the result of a measurement of the other
particle. This holds not only for a measurement in the basis $|0\rangle/|1\rangle$, but for
any arbitrary superposition, that is, for any arbitrary orientation of the
measurement apparatus. In particular, for the state $|\Psi^-\rangle$ we shall find the two
particles always in orthogonal states, no matter which measurement apparatus
is used. If, for the case of polarization-entangled photons, we observe
only one of the two photons, it appears to be completely unpolarized, and
any polarization direction is observed with equal probability. However, the
results for both photons are perfectly correlated. For example, this means
that photon 2 has vertical polarization if we found horizontal polarization
for photon 1, but also that photon 2 will be circularly polarized left if we
observed right circular polarization for photon 1.
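
These statements can be illustrated numerically for the antisymmetric Bell
state. The sketch below assumes the identification |0> = |H>, |1> = |V> and
describes an analyzer setting by the single-photon state it transmits; the
function names are ours. For every setting tried, the two photons are never
found in the same state, they are found in orthogonal states with total
probability 1/2 + 1/2, and each single result occurs with probability 1/2.

```python
import math

S = 1 / math.sqrt(2)
# Antisymmetric Bell state, amplitudes in the assumed order HH, HV, VH, VV.
PSI_MINUS = [0, S, -S, 0]


def joint_prob(a, b, state=PSI_MINUS):
    """Probability of finding photon 1 in state a and photon 2 in state b.

    a and b are pairs of amplitudes (H component, V component)."""
    amp = 0
    for i in range(2):
        for j in range(2):
            amp += (complex(a[i]).conjugate()
                    * complex(b[j]).conjugate()
                    * state[2 * i + j])
    return abs(amp) ** 2


def linear(theta_deg):
    """Linear polarization at angle theta (degrees) from horizontal."""
    t = math.radians(theta_deg)
    return (math.cos(t), math.sin(t))


def orthogonal(p):
    """Single-photon state orthogonal to p."""
    return (-complex(p[1]).conjugate(), complex(p[0]).conjugate())


RIGHT = (S, 1j * S)         # one circular polarization
LEFT = orthogonal(RIGHT)    # the opposite circular polarization (up to a phase)

for p in [linear(0), linear(22.5), linear(45), RIGHT]:
    same = joint_prob(p, p)                 # both photons pass identical analyzers
    opposite = joint_prob(p, orthogonal(p))
    marginal = same + opposite              # photon 1 alone is found in state p
    print(round(same, 12), round(opposite, 12), round(marginal, 12))
```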

Another important feature of the four Bell states is that a manipulation
of only one of the two particles suffices to transform from any Bell state to
any of the other three states. This is not possible for the basis formed by the
products. For example, to transform $|0\rangle_1|0\rangle_2$ into $|1\rangle_1|1\rangle_2$ one has to flip the
state of both particles.

These three features, 

• different statistical results for measurements on entangled or unentangled 
pairs 

• perfect correlations between the observations of the two particles of a pair, 
although the results of the measurements on the individual particles are 
fully random 

• the possibility to transform between the Bell states by manipulating only 
one of the two particles, 

are the ingredients of the fundamental quantum communication schemes de- 
scribed here. 



3.3 The Quantum Communication Schemes 

Quantum communication methods utilize fundamental properties of quantum
mechanics to enhance the power and potential of today’s communication
systems. The first step towards quantum information processing is the
generalization of classical digital encoding, which uses the bit values “0” and
“1”. Quantum information associates two distinguishable, orthogonal states
of a two-state system with these bit values. We thus directly translate the
two values of a classical bit to the two basis states $|0\rangle$ and $|1\rangle$.
As an extension to the situation for classical communication, the quantum
system can be in any superposition of the two basis states. To distinguish such
a quantum state and the information contained in it from a classical bit, such
a state is called a “qubit” [69]. The general state of a qubit is

$$|\Psi\rangle = a_0|0\rangle + a_1|1\rangle , \tag{3.6}$$

where $a_0$ and $a_1$ are complex amplitudes (with $|a_0|^2 + |a_1|^2 = 1$).

A measurement of the qubit projects the state onto either |0) or |1) and 
therefore cannot give the full quantum information about the state. Evi- 
dently, if we want to communicate information, we have to restrict ourselves 
to sending only basis states in order to avoid errors, and thus only one bit 
of classical information can be sent with a single qubit. Therefore, the new 
features do not seem to offer additional power. However, by provoking errors, 
quantum cryptography [42, 122] enables one to check the security of quantum 
key generation. The security of quantum cryptography relies on the fact that 
an eavesdropper cannot unambiguously read the state of a single quantum 
particle which is transferred from Alice to Bob. 
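
Why a single measurement cannot reveal the full state of a qubit can be seen
from a small numerical illustration. The sketch below (the function name
`outcome_probs` and the three example qubits are our own choices) evaluates
the probabilities of the two results for different superpositions. All three
qubits give exactly the same statistics in the basis |0>/|1>, so an observer
restricted to this measurement cannot tell them apart.

```python
import math


def outcome_probs(a0, a1):
    """Probabilities of the results |0> and |1> for the qubit a0|0> + a1|1>."""
    norm = abs(a0) ** 2 + abs(a1) ** 2
    return abs(a0) ** 2 / norm, abs(a1) ** 2 / norm


S = 1 / math.sqrt(2)
QUBITS = {
    "|0> + |1>": (S, S),
    "|0> - |1>": (S, -S),
    "|0> + i|1>": (S, 1j * S),
}

for label, (a0, a1) in QUBITS.items():
    print(label, outcome_probs(a0, a1))
```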

When two-particle systems are used, entanglement adds many more fea- 
tures to the capabilities of quantum communication systems compared with 
classical systems. In recent years, several proposals have shown how to ex- 
ploit the basic features of entangled states in new quantum communication 
schemes. In the following we shall see how entangled pairs enable a new for- 
mulation of quantum cryptography (Sect. 3.3.1), how we can surpass the limit 
of transmitting only one bit per qubit (Sect. 3.3.2) and how entanglement 
allows one to transfer quantum information from one particle to another in 
the process of quantum teleportation (Sect. 3.3.3). 

3.3.1 Quantum Cryptography 

Suppose two parties, let us call them Alice and Bob, want to send each 
other secret messages. There exists a cryptographic method, the one-time 
pad scheme,^ which is secure against eavesdropping attacks - provided the 
key used for encoding and decoding the message is perfectly random, is as 
long as the original message and, most importantly, is secret and known 
only to Alice and Bob. But how can they be sure that the key was securely 
distributed to the two, and that no third person has knowledge about the key? 
Quantum cryptography [42, 122] provides a means to ensure the security of
the key distribution and thus enables, together with the one-time pad scheme,
absolutely secret communication.^

Fig. 3.1. Scheme for entanglement-based quantum cryptography [46]

^ In the so-called “one-time pad” encryption (see Sect. 3.1), every character of the
message is encoded with a random key character. As shown by Shannon [44], the
cipher cannot be decoded without a knowledge of the key. Eavesdropping
is impossible as long as the key is securely exchanged between the sender and
receiver.
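
The one-time pad itself is easily sketched. The example below is only an
illustration under stated assumptions: the key is combined with the message by
a bitwise XOR (one common realization of “encoding every character with a
random key character”), and the key is drawn from the operating system’s
random source via Python’s `secrets` module, standing in for a key that would
in practice come from the quantum key distribution. The function name and the
message are hypothetical.

```python
import secrets


def otp_xor(data: bytes, key: bytes) -> bytes:
    """Encode or decode with a one-time pad (bitwise XOR with the key).

    The key must be as long as the message and must be used only once."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at noon"
key = secrets.token_bytes(len(message))   # stand-in for the shared secret key

cipher = otp_xor(message, key)            # what is sent over the public channel
recovered = otp_xor(cipher, key)          # decoding uses the same operation
```

Since XOR is its own inverse, the same function serves for encoding and
decoding; the whole burden of security lies in distributing `key`.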

^ For descriptions of quantum cryptography schemes not relying on entanglement,
see [105, 106].

Let us first discuss how quantum cryptography can profit from the fascinating
properties of entangled systems to provide secure key exchange
[46, 123]. Suppose that Alice and Bob receive particles which are in entangled
pairs, from an EPR source (Fig. 3.1). Beforehand, Alice and Bob agreed
on some preferred basis, again called $|0\rangle/|1\rangle$, in which they start to perform
measurements. The possible results, +1 and −1, correspond to observation
of the state $|1\rangle$ or $|0\rangle$, respectively. Owing to the entanglement of the
particles, the measurement results of Alice and Bob will be perfectly correlated
or, in a case where the source produces pairs in the $|\Psi^-\rangle$ state, perfectly
anticorrelated. For each instance where Alice obtained −1, she knows that Bob
observed +1, and if she obtained the result +1, she knows that Bob had −1.
Alice and Bob can translate the result −1 to the bit value 0 and the result
+1 to the bit value 1 and thereby establish a random key, ideal for encoding
messages. But how can they be sure that no eavesdropper has intercepted
the key exchange? There are two different techniques. The first scheme for
entanglement-based quantum cryptography [123] builds on the ideas of the
basic quantum cryptography protocol for single photons [42, 122]. In this
case, Alice and Bob randomly and independently vary their analysis directions
between 0°, corresponding to the $|0\rangle/|1\rangle$ basis, and 45°, corresponding
to a second, noncommuting basis. They will observe perfect anticorrelations
of their measurements whenever they happen to have polarizers oriented
parallel (Alice and Bob thus obtain identical keys, if one of them inverts all bits
of his/her key string). This can be viewed in the following way: as Alice makes
a measurement on photon A she projects photon B into the orthogonal state,
which is then analyzed by Bob. An eavesdropper, not knowing the actual basis,
causes errors, since he/she cannot determine the quantum state without
information about the basis. Thus, Alice and Bob can find out, by communication
over a classical, public channel, whether or not their key exchange has
been attacked by checking whether or not some of the key bits are different.
Of course, those test key bits cannot be used for secure communication and
have to be sacrificed.
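
How many errors an eavesdropper causes depends on the attack. As an
illustration, the sketch below evaluates one simple strategy that we assume
here and that is not discussed in the text: the eavesdropper measures every
photon B in a basis chosen at random from 0° and 45° and sends on a photon in
the state just observed. Only events with parallel analyzers at Alice and Bob
are considered. Under these assumptions a quarter of the test bits disagree.

```python
import math
from itertools import product


def overlap(angle1_deg, angle2_deg):
    """|<angle1|angle2>|^2 for two linear polarization states."""
    return math.cos(math.radians(angle1_deg - angle2_deg)) ** 2


def error_rate(bases=(0.0, 45.0)):
    """Average probability that Bob's result breaks the expected
    anticorrelation when every photon B is measured and resent
    (assumed attack model; Alice and Bob use the same basis)."""
    total = 0.0
    for alice_basis, eve_basis in product(bases, repeat=2):
        # Whatever Alice finds, photon B is left in the state orthogonal to
        # her result; by symmetry it is enough to follow one of her results.
        photon_b = alice_basis + 90.0
        p_ok = 0.0
        for eve_result in (eve_basis, eve_basis + 90.0):
            p_eve = overlap(photon_b, eve_result)   # eavesdropper finds this state
            p_bob = overlap(eve_result, photon_b)   # Bob still sees the expected one
            p_ok += p_eve * p_bob
        total += (1.0 - p_ok) / len(bases) ** 2
    return total


print(error_rate())
```

When the eavesdropper happens to pick the basis used by Alice and Bob, no
error occurs; in the other half of the cases Bob’s result is wrong with
probability 1/2, which gives the average of 1/4.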

The other technique uses the fragility of entanglement against measure- 
ments. Any attack an eavesdropper might perform reduces the entanglement 
and allows Alice and Bob to check the security of their quantum key ex- 
change. As described in Sect. 3.2, measurements on entangled pairs obey 
statistical correlations and will violate a Bell inequality. It can be shown that 
the more knowledge the eavesdropper has gained when he/she intercepted 
the key exchange, the less the inequality is violated. The amount by which 
a Bell inequality is violated is thus an ideal measure of the security of the 
key. Alice and Bob therefore measure the entangled particles not only in the 
basis |0)/|1), but also along some other directions, depending on the Bell 
inequality used. A particularly simple form of Bell inequality, which is well 
suited for experimental application, is the version deduced by Wigner [28], 
which can be used as follows. 

Alice chooses between two polarization measurements of photon A, either
along the axis $\alpha$ or along the axis $\beta$, and Bob chooses between measurements
along $\beta$ and $\gamma$ of photon B. We identify the direction $\beta$, which is common
to the two users, with our preferred basis $|0\rangle/|1\rangle$. A detected polarization
parallel to the analyzer axis corresponds to a +1 result, and a polarization
orthogonal to the analyzer axis corresponds to −1. If, heretically, one assumes
that every photon carries preassigned values determining the outcomes of the
measurements on each of the photon pairs, it follows that the probabilities
of obtaining +1 on both sides, $p_{++}$, must obey Wigner’s inequality:

$$p_{++}(\alpha_A, \beta_B) + p_{++}(\beta_A, \gamma_B) - p_{++}(\alpha_A, \gamma_B) \geq 0 . \tag{3.7}$$

The quantum mechanical prediction $p^{\mathrm{qm}}_{++}$ for these probabilities with some
arbitrary analyzer settings $\Theta_A$ (Alice) and $\Theta_B$ (Bob) and measurement of the
$|\Psi^-\rangle$ state is

$$p^{\mathrm{qm}}_{++}(\Theta_A, \Theta_B) = \frac{1}{2} \sin^2(\Theta_A - \Theta_B) . \tag{3.8}$$

The analyzer settings $\alpha = -30^\circ$, $\beta = 0^\circ$ and $\gamma = 30^\circ$ lead to a maximum
violation of Wigner’s inequality (3.7):

$$p^{\mathrm{qm}}_{++}(-30^\circ, 0^\circ) + p^{\mathrm{qm}}_{++}(0^\circ, 30^\circ) - p^{\mathrm{qm}}_{++}(-30^\circ, 30^\circ) = \frac{1}{8} + \frac{1}{8} - \frac{3}{8} = -\frac{1}{8} , \tag{3.9}$$

which is not greater than or equal to 0.

In order to implement quantum key distribution, Alice and Bob each vary
their analyzers randomly between two settings: Alice uses −30°, 0°, and Bob
uses 0°, 30°. Because Alice and Bob operate independently, four possible
combinations of analyzer settings will occur, of which the three oblique settings
allow a test of Wigner’s inequality and the remaining combination of parallel
settings allows the generation of keys via the perfect anticorrelation (where,
again, either Alice or Bob has to invert all bits of the key to obtain identical
keys). If the measured probabilities violate Wigner’s inequality, the security
of the quantum channel is ascertained, and the keys generated can readily
be used. This scheme is an improvement on the Ekert scheme, which uses
the CHSH inequality. Since there are fewer settings on each side, the above
version is technically easier to implement and also uses the photon pairs more
efficiently for key generation.
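
The size of the violation is quickly verified. The following sketch evaluates
the left-hand side of Wigner’s inequality (3.7) with the quantum mechanical
prediction (3.8); the function names `p_pp` and `wigner` are ours. For the
settings −30°, 0°, 30° one obtains −1/8, and a coarse scan in steps of 1°
(with the common direction kept at 0°) finds no setting with a stronger
violation.

```python
import math


def p_pp(theta_a_deg, theta_b_deg):
    """Quantum prediction, Eq. (3.8): probability of +1 on both sides."""
    return 0.5 * math.sin(math.radians(theta_a_deg - theta_b_deg)) ** 2


def wigner(alpha, beta, gamma):
    """Left-hand side of Wigner's inequality, Eq. (3.7)."""
    return p_pp(alpha, beta) + p_pp(beta, gamma) - p_pp(alpha, gamma)


value = wigner(-30.0, 0.0, 30.0)

# Coarse scan over Alice's first and Bob's second setting in steps of 1 degree.
scan_min = min(wigner(-x, 0.0, y) for x in range(91) for y in range(91))

print(value, scan_min)
```

Parallel settings give `p_pp` equal to zero, which is the perfect
anticorrelation used for generating the key.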

Compared with standard attenuated-pulse quantum cryptography, such
systems are practically immune to any beam-splitter attack (or other attacks
that try to split pulses containing more than one photon) by a potential
eavesdropper. First of all, a photon pair source can be used as an (almost)
ideal source of single photons. If one of the photons is detected, the gate
time of the coincidence electronics (typically on the order of 1 ns) determines
the equivalent pulse duration in standard quantum cryptography. Since the
probability of generating one photon pair during such a short time is very
low, e.g., for the experiment described in Sect. 3.5.1, only about $6.8 \times 10^{-4}$,
the probability of having two photons in the gate time is less than $3 \times 10^{-7}$
and can be almost neglected. This has to be compared with a probability of
having two photons in a pulse of 0.005 for a typical quantum cryptography
realization using a mean of 0.1 photons per pulse.
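
The orders of magnitude quoted here can be reproduced with a short estimate.
The sketch below assumes Poissonian statistics for the number of photons per
pulse (or of pairs per gate time), which is an assumption of ours and not
stated in the text; the value 6.8e-4 for the pair probability per gate is
likewise taken as an input. The function name is hypothetical.

```python
import math


def p_two_or_more(mean):
    """Probability of two or more photons, assuming Poissonian statistics."""
    return 1.0 - math.exp(-mean) * (1.0 + mean)


# Attenuated pulses with a mean of 0.1 photons per pulse.
attenuated = p_two_or_more(0.1)

# Pair source: assumed probability of 6.8e-4 for one pair per gate time.
pair_source = p_two_or_more(6.8e-4)

print(attenuated, pair_source)
```

With these inputs the attenuated pulse contains two or more photons in roughly
0.5% of the cases, whereas the corresponding probability for the pair source
stays below 3e-7.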

However, the security against beam-splitting attacks can be further increased
when entanglement-based schemes are used. In this case, there is
only a correlation between two entangled pairs if they are simultaneously
generated during a time interval of the order of the coherence time of the
photons, i.e. during a time of typically 500 fs. This reduces the chances of
an eavesdropper learning the value of a key bit to about $6 \times 10^{-14}$ and
guarantees unprecedented security of the quantum key. Moreover, by utilizing
the peculiar properties of entangled photon pairs produced by parametric
down-conversion, one immediately profits from the inherent randomness of
quantum mechanical observations, which guarantees a truly random and
nondeterministic key.

3.3.2 Quantum Dense Coding 

If one wants to send some information, one encodes the message with
distinguishable symbols, writes them on some physical entity and finally
transmits this entity to the receiver. To send one bit of information one uses,
for example, the binary values “0” and “1” as code symbols written on the
information carrier. If one wants to send two bits of information, one
consequently has to perform the process twice; that means one has to send two
such entities.

Fig. 3.2. Scheme for the efficient transmission of classical information by quantum
dense coding [51] (BSM, Bell-state measurement; U, unitary transformation)

As mentioned above, in the case of quantum information one identifies 
the two binary values with the two orthogonal basis states |0) and |1) of 
the qubit. In order to send a classical message to Bob, Alice uses quantum 
particles, all prepared in the same state by some source. Alice translates the 
bit values of the message by either leaving the state of the qubit unchanged or 
flipping it to the other, orthogonal state, and Bob, consequently, will observe 
the particle in one or the other state. That means that Alice can encode one 
bit of information in a single qubit. Obviously, she cannot do better, since in 
order to avoid errors, the states arriving at Bob have to be distinguishable, 
which is only guaranteed when orthogonal states are used. In this respect, 
they do not gain anything by using qubits as compared with classical bits. 
Also, if she wants to communicate two bits of information, Alice has to send 
two qubits. 

Bennett and Wiesner found a clever way to circumvent the classical limit
and showed how to increase the channel capacity by utilizing entangled
particles [51]. Suppose the particle which Alice obtained from the source is
entangled with another particle, which was sent directly to Bob (Fig. 3.2). The
two particles are in one of the four Bell states, say $|\Psi^-\rangle$. Alice now can use a
particular feature of the Bell basis, that manipulation of one of the two
entangled particles suffices to transform to any other of the four Bell states. Thus
she can perform one out of four possible transformations - that is, doing
nothing, shifting the phase by $\pi$, flipping the state, or flipping and
phase-shifting the state - to transform the two-particle state of their common pair
to another state. After Alice has sent the transformed two-state particle to
Bob, he can read the information by performing a combined measurement on
both particles. He makes a measurement in the Bell-state basis and can
identify which of the four possible messages was sent by Alice. Thus it is possible
to encode two bits of classical information by manipulating and transmitting
a single two-state system. Entanglement enables one to communicate
information more efficiently than any classical system could do.
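
The complete dense-coding cycle can be followed in a small calculation. The
sketch below is only an illustration: the assignment of the two-bit messages
to Alice’s four operations, the choice of particle 1 as Alice’s particle and
all function names are assumptions of ours. Starting from the antisymmetric
Bell state, each of the four local operations produces a different Bell state,
which Bob identifies unambiguously by projecting onto the Bell basis.

```python
import math

S = 1 / math.sqrt(2)
# Amplitudes in the assumed order |00>, |01>, |10>, |11>.
BELL = {"Psi+": [0, S, S, 0], "Psi-": [0, S, -S, 0],
        "Phi+": [S, 0, 0, S], "Phi-": [S, 0, 0, -S]}

IDENTITY = [[1, 0], [0, 1]]
FLIP = [[0, 1], [1, 0]]       # exchanges |0> and |1>
PHASE = [[1, 0], [0, -1]]     # phase shift of pi on |1>


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply_to_first(u, state):
    """Apply the 2x2 matrix u to particle 1 only."""
    out = [0, 0, 0, 0]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                out[2 * i + j] += u[i][k] * state[2 * k + j]
    return out


# Hypothetical code table: two classical bits -> Alice's local operation.
ENCODING = {"00": IDENTITY, "01": PHASE, "10": FLIP,
            "11": matmul(FLIP, PHASE)}


def bell_measurement(state):
    """Return the Bell state found with the highest probability."""
    probs = {name: abs(sum(b * s for b, s in zip(vec, state))) ** 2
             for name, vec in BELL.items()}
    return max(probs, key=probs.get)


def send(bits):
    """Alice encodes two bits on her particle; Bob measures both particles."""
    return bell_measurement(apply_to_first(ENCODING[bits], BELL["Psi-"]))


for bits in ENCODING:
    print(bits, "->", send(bits))
```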

The preceding examples show how quantum information can be applied 
for secure and efficient transmission of classical information. But can one 
also transmit quantum information, that is, the state of a qubit? Obviously, 
quantum mechanics places a number of obstacles in the way of this intention, 
above all, the problem of measuring quantum states, which is utilized in 
quantum cryptography as already described. 

3.3.3 Quantum Teleportation 

The Idea. It is an everyday task, in our classical world, for Alice to send 
some information to Bob. Imagine a fax machine. Alice might have some 
message, written on a sheet of paper. For the fax machine the actual written 
information does not matter, in fact, it reduces to just a sequence of white 
and black pixels. For the transmission, the machine scans the paper pixel by 
pixel. It measures whether a pixel is white or black and sends this information 
to Bob’s machine, which writes the state of each pixel onto another sheet of 
paper. In classical physics, by definition, one can make the measurements with 
arbitrary precision, and Bob’s sheet can thus become an ideal copy of Alice’s 
original sheet of paper. If Alice’s pixels were made smaller and smaller, they 
would, in reality, sooner or later be encoded on single molecules or atoms. 
If we again confined ourselves to coding in only the basis states, we surely 
could measure and transfer the binary value of even such pixels. 

Now, imagine Alice not only has classical binary values encoded on her 
system, but wants to send a quantum state, i.e. quantum information, to 
Bob. She has a qubit encoded on some quantum system such as a molecule 
or atom, and wishes that a quantum system in Bob’s hands should represent
this qubit at the end of the transmission. Evidently, Alice cannot read the 
quantum information, that is, measure the state of the quantum object with 
arbitrary precision. All she would learn from her measurement would be that 
the amplitude of the observed basis state was not zero. But this is not enough 
information for Bob to reconstruct the qubit on his quantum particle. 

Another limitation, which definitely seems to bring the quest for perfect 
transfer of the quantum information to an end, is the no-cloning theorem 
(see Sect. 3.1) [4]. According to this theorem, the state of a quantum system 
cannot be copied onto another quantum system with arbitrary precision. 
Thus, how could Bob’s quantum particle obtain the state of Alice’s particle? 

In 1993 Bennett et al. found the solution to this problem [52]. In their
scheme, a chain of quantum correlations is established between the particle
carrying the initial quantum state and Bob’s particle. They dispense with
measuring the initial state; actually, they avoid gaining any knowledge about
this state at all!

Fig. 3.3. Scheme for teleporting a quantum state from one system to another one
[52]

To perform quantum teleportation, initially Alice and Bob share an entangled
pair of particles 2 and 3, which they have obtained from some source
of entangled particles, in, say, the state $|\Psi^-\rangle_{2,3}$ (Fig. 3.3). As mentioned
before, we cannot say anything about the state of particle 2 on its own. Nor do
we know the state of particle 3. In fact, these particles do not have a (pure)
state at all. But, whatever the results of measurements might be, we know
for sure that they are orthogonal to each other. Next, particle 1, which carries
the state to be sent to Bob, is given to Alice. She now measures particles
1 and 2 together, by projecting them onto the Bell-state basis. After projecting
the two particles into an entangled state, she cannot infer anything
about the individual states of particles 1 and 2 anymore. However, she knows
about correlations between the two. Let us assume she has obtained the result
$|\Psi^-\rangle_{1,2}$. This tells her that whatever the two states of particles 1 and 2 have
been, they have been orthogonal to each other. But from this, Alice already
knows that the state of particle 3 is equal to the state of particle 1 (up to a
possible overall phase shift). This follows because the state of particle 1 was
orthogonal to 2 and, owing to the preparation of particles 2 and 3, the state
of particle 2 was orthogonal to 3. All Alice has to do is to tell this to Bob,
to let him know that, in this particular case, the state of his particle 3 is the
same as that which particle 1 had initially.

Of course, since there are four orthogonal Bell states, there are four equally
probable outcomes for Alice’s Bell-state measurement. If Alice has obtained
another result, the state of Bob’s particle is again related to the initial state
of particle 1, up to a characteristic unitary transformation. This stems from
the fact that a unitary transformation of one of two entangled particles can
transform from any Bell state to any other.

Therefore, Alice has to send the result of her Bell-state measurement
(i.e. a number between 0 and 3, or, equivalently, two bits of information)
via a classical communication channel to Bob. He then can restore the initial
quantum state of particle 1 on his particle 3 by performing the correct unitary
transformation.

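Formally, we describe the initial state of particle 1 by $|\chi\rangle_1 = a|H\rangle_1 + b|V\rangle_1$,
and the state of the EPR pair 2 and 3 by $|\Psi^-\rangle_{2,3}$. Then the joint three-photon
system is in the product state

$$|\Psi\rangle_{1,2,3} = |\chi\rangle_1 \otimes |\Psi^-\rangle_{2,3} , \tag{3.10}$$

which can be decomposed into

$$\begin{aligned}
|\Psi\rangle_{1,2,3} = \frac{1}{2}\Big[ &-|\Psi^-\rangle_{1,2}\left(a|H\rangle_3 + b|V\rangle_3\right) - |\Psi^+\rangle_{1,2}\left(a|H\rangle_3 - b|V\rangle_3\right) \\
&+ |\Phi^-\rangle_{1,2}\left(a|V\rangle_3 + b|H\rangle_3\right) + |\Phi^+\rangle_{1,2}\left(a|V\rangle_3 - b|H\rangle_3\right) \Big] .
\end{aligned} \tag{3.11}$$

One easily sees that after observation of particles 1 and 2 in one of the
four Bell states, the corresponding unitary transformation enables Bob to
transfer the initial state of particle 1 to particle 3.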
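
The decomposition (3.11) and the corresponding corrections can be checked
numerically. The following sketch identifies |H> with the first and |V> with
the second basis state, projects particles 1 and 2 onto each Bell state and
applies to particle 3 the operation read off from (3.11). The table
`CORRECTION` and all function names are our own notation, and the test
amplitudes are arbitrary. Each outcome occurs with probability 1/4, and after
the correction Bob’s state agrees with the initial one up to an overall phase.

```python
import math

S = 1 / math.sqrt(2)
# Two-particle amplitudes in the assumed order HH, HV, VH, VV.
BELL = {"Psi-": [0, S, -S, 0], "Psi+": [0, S, S, 0],
        "Phi-": [S, 0, 0, -S], "Phi+": [S, 0, 0, S]}

# Bob's operation for each of Alice's results, read off from Eq. (3.11).
CORRECTION = {
    "Psi-": [[1, 0], [0, 1]],     # nothing to do
    "Psi+": [[1, 0], [0, -1]],    # phase shift
    "Phi-": [[0, 1], [1, 0]],     # flip
    "Phi+": [[0, 1], [-1, 0]],    # flip followed by phase shift
}


def teleport(a, b):
    """Return {Alice's result: (probability, Bob's corrected state)}."""
    chi = (a, b)
    pair = BELL["Psi-"]                  # shared pair of particles 2 and 3
    results = {}
    for name, bell in BELL.items():
        bob = [0, 0]                     # unnormalized state of particle 3
        for i in range(2):               # particle 1
            for j in range(2):           # particle 2
                for k in range(2):       # particle 3
                    # The Bell amplitudes are real, so no conjugation is needed.
                    bob[k] += bell[2 * i + j] * chi[i] * pair[2 * j + k]
        prob = abs(bob[0]) ** 2 + abs(bob[1]) ** 2
        u = CORRECTION[name]
        fixed = [(u[r][0] * bob[0] + u[r][1] * bob[1]) / math.sqrt(prob)
                 for r in range(2)]
        results[name] = (prob, fixed)
    return results


def fidelity(state, a, b):
    """Overlap |<chi|state>|^2 with the initial state a|H> + b|V>."""
    return abs(complex(a).conjugate() * state[0]
               + complex(b).conjugate() * state[1]) ** 2


for name, (prob, state) in teleport(0.6, 0.8j).items():
    print(name, round(prob, 12), round(fidelity(state, 0.6, 0.8j), 12))
```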

Some Remarks. The principle of quantum teleportation incorporates all 
the characteristic features of entangled systems, and, in an astounding man- 
ner, profits from the obstacles seemingly imposed by quantum mechanics. It 
should be emphasized that quantum teleportation is well within the concepts 
of conventional physics and quantum mechanics. Let us briefly discuss a few 
not infrequently occurring misunderstandings. 

First, the no-cloning theorem is not violated. The state of particle 1 can 
only be restored on particle 3 if the measurement performed by Alice does not 
give any information about the state! After Alice’s Bell-state measurement, 
particle 1 is in a mixed state, which is absolutely uncorrelated with the initial 
state of particle 1. Therefore, the particular quantum state which is teleported 
can be attributed to only one particle at a time, never to two. 

Secondly, there is no faster-than-light communication achieved in quan- 
tum teleportation. Even if Alice knows, right after her measurement, that 
Bob’s particle is already either in the correct state or in one of the three 
other possible states, she has to send this information to Bob. The classical 
information sent to Bob is transmitted, according to the theory of relativ- 
ity, at the speed of light at maximum. Only after receiving the result and 
after performing the correct unitary transformation can Bob restore the ini- 
tial quantum state. If Bob does not know the result of Alice’s measurement, 
his particle is in a mixed state, which is not correlated at all with the ini- 
tial state. Thus quantum information, the qubit, cannot be transferred faster 
than classical information. 
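
That Bob’s particle carries no information before the classical message
arrives can be made explicit with (3.11). The short derivation below is a
sketch in our own notation ($\rho_3$ for the state of particle 3 and
$|\varphi_k\rangle$ for the four conditional states appearing in (3.11)); it
uses only the normalization $|a|^2 + |b|^2 = 1$.

```latex
\begin{align*}
\rho_3 &= \tfrac14 \sum_{k=1}^{4} |\varphi_k\rangle\langle\varphi_k| , \qquad
|\varphi_{1,2}\rangle = a|H\rangle \pm b|V\rangle , \quad
|\varphi_{3,4}\rangle = a|V\rangle \pm b|H\rangle , \\
|\varphi_1\rangle\langle\varphi_1| + |\varphi_2\rangle\langle\varphi_2|
  &= 2\left(|a|^2\,|H\rangle\langle H| + |b|^2\,|V\rangle\langle V|\right) , \\
|\varphi_3\rangle\langle\varphi_3| + |\varphi_4\rangle\langle\varphi_4|
  &= 2\left(|a|^2\,|V\rangle\langle V| + |b|^2\,|H\rangle\langle H|\right) , \\
\rho_3 &= \tfrac12\left(|a|^2 + |b|^2\right)
          \left(|H\rangle\langle H| + |V\rangle\langle V|\right)
        = \tfrac12\,\mathbb{1} .
\end{align*}
```

The cross terms cancel pairwise, so the result is the completely mixed state,
independent of $a$ and $b$.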
Fig. 3.4. Scheme for entangling particles that have never interacted by the process
of entanglement swapping [125] 



And, thirdly, there is also no transfer of matter or energy (other than that 
required for the transmission of classical information). All that makes up a 
particle are its properties, described by the quantum state. For example, the 
state of a free neutron defines its momentum and its spin. If one transfers 
the state onto another neutron, this particle obtains all the properties of the 
first one; in fact, it becomes the initial particle. We leave it to the science
fiction writers to apply the scheme to bigger and bigger objects. Whether or
not this idea will help some Captain Kirk to get back to his space ship
cannot be answered here. Certainly, a lot of other problems need to be
solved as well.^

It is appropriate to point out some generalizations of the principle of 
quantum teleportation. It is not necessary that the initial state which is to 
be teleported is a pure state. In fact it can be any mixed state, or even the 
undefined state of an entangled particle. This is best demonstrated by entan- 
glement swapping [125]. Here, the particle to be teleported (1) is entangled 
with yet another one (4) (Fig. 3.4). The state of particle 1 on its own is a 
mixed state; however, it can be determined by the observation of particle 4. 
Quantum teleportation allows us to transfer the state of particle 1 onto par- 
ticle 3. Since quantum teleportation works for any arbitrary quantum state, 
particle 3 thus becomes entangled with particle 4. Note that particles 3 and 4
do not come from the same source, nor did they ever interact with each other. 
Nevertheless, it is possible to entangle them by swapping the entanglement 
in the process of quantum teleportation. 
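
Entanglement swapping can be verified in the same elementary way. In the
sketch below we assume, for illustration only, that both pairs (particles 1
and 4, and particles 2 and 3) are prepared in the antisymmetric Bell state;
the function names are ours. Projecting particles 1 and 2 onto any of the four
Bell states leaves particles 3 and 4, which never interacted, in the same Bell
state (up to an overall sign), each outcome occurring with probability 1/4.

```python
import math

S = 1 / math.sqrt(2)
# Two-particle amplitudes in the assumed order |00>, |01>, |10>, |11>.
BELL = {"Psi+": [0, S, S, 0], "Psi-": [0, S, -S, 0],
        "Phi+": [S, 0, 0, S], "Phi-": [S, 0, 0, -S]}


def swap(outcome):
    """Project particles 1 and 2 onto the Bell state `outcome`.

    Initial state (assumed): Psi- for particles (1, 4) and Psi- for (2, 3).
    Returns (probability, normalized state of particles 3 and 4)."""
    psi = BELL["Psi-"]
    bell = BELL[outcome]
    out = [0.0, 0.0, 0.0, 0.0]            # amplitudes for particles (3, 4)
    for i in range(2):                    # particle 1
        for j in range(2):                # particle 2
            for k in range(2):            # particle 3
                for m in range(2):        # particle 4
                    out[2 * k + m] += (bell[2 * i + j]
                                       * psi[2 * i + m]    # pair (1, 4)
                                       * psi[2 * j + k])   # pair (2, 3)
    prob = sum(x * x for x in out)
    return prob, [x / math.sqrt(prob) for x in out]


def overlap(u, v):
    """|<u|v>|^2 for real amplitude lists."""
    return abs(sum(x * y for x, y in zip(u, v))) ** 2


for name in BELL:
    prob, state34 = swap(name)
    print(name, round(prob, 12), round(overlap(state34, BELL[name]), 12))
```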

^ The “technical manuals of Star Trek” mention, as a necessary part of their
transporter, a “Heisenberg compensator” [124]. Quantum teleportation seems to
provide a solution for this marvelous device. However, a lot more is necessary to
beam large objects.

Fig. 3.5. Remote state preparation of Bob’s particle 2, by a manipulation (M) of
particle 1



Quantum teleportation is not confined to transferring two-state quantum
systems. If Alice and Bob share an entangled pair of N-state particles, they
can teleport the state of an N-dimensional quantum system [126]. As before,
Alice performs projection onto the $N^2$-dimensional basis of entangled states
spanning the product space of particles 1 and 2. The result, one out of $N^2$
equally probable results, has to be communicated to Bob, who then can again
restore the initial state of particle 1 by the corresponding unitary
transformation of his particle 3. If Alice and Bob share a pair of particles that are
entangled in the original sense of EPR, that is, for continuous variables or
$\infty$-dimensional states, they also can teleport properties such as the position
and momentum of particles or the phase and amplitude of electromagnetic
fields [11].

A considerable simplification of quantum teleportation, especially in terms
of experimental realization, transfers not the quantum state of a particle, but
rather the manipulation performed on the entangled particle which is given
to Alice [127] (Fig. 3.5). Again, one first distributes an entangled pair to Alice
and Bob. But before Alice gets hold of her particle 1 and can perform
measurements on it, the state of this particle is manipulated in another degree of
freedom. One cannot talk about a two-state system anymore. Rather, particle
1 now is described in a four-dimensional Hilbert space, spanned by the
original degree of freedom and the new one. Formally, however, this mimics the
two two-state particles given to Alice in the standard quantum teleportation
scheme. Consequently, a measurement in the four-dimensional Hilbert space
of particle 1, which perfectly erases the quantum information by mixing the
two degrees of freedom, is performed. This gives the necessary information for
Bob to perform the correct unitary transformation on his particle. That way,
the originally mixed state of particle 2 can be turned into a pure state which
depends on the manipulation initially performed on particle 1. Using such
a scheme, one can remotely prepare particle 2 in any pure quantum state.
Thus, it is not necessary to send two real numbers to Bob if one wants him to
have a certain, pure quantum state prepared on his particle. If he is provided
with one of a pair of entangled particles, Alice simply has to transmit two
bits of classical information to Bob.



3.4 The Experimental Prerequisites 

Before turning to the fascinating applications of entangled systems, let us re- 
view how to produce, how to manipulate and how to measure such quantum 
systems experimentally. The last decade saw incredible progress in the ex- 
perimental techniques for handling various quantum systems. However, there 
are additional challenges when working with entangled systems, especially 
the careful control of interactions and decoherence of the quantum systems. 

In their seminal work Einstein, Podolsky and Rosen considered particles 
which interacted with each other for a certain time and which thus became 
entangled and thereafter exhibited the puzzling, nonclassical correlations. 
The interaction needed to entangle a pair of particles is just the same as von 
Neumann had in mind when describing the measurement process. Ideally, it 
couples two quantum systems in such a way that, if the first system is in one 
of a set of distinguishable (orthogonal) states, the second system will change 
into a well-defined, corresponding state. Let us look at such a coupling for 
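the simplest case of two two-state systems. As before, the two basis states
are denoted as $|0\rangle$ and $|1\rangle$. The coupling is such that if system 1 is in state
$|0\rangle_1$, system 2 will remain in its initial state, say $|0\rangle_2$, whereas if system 1
is in state $|1\rangle_1$, system 2 will flip to the orthogonal state, i.e. to $|1\rangle_2$. The
nonclassical features arise if system 1 is in a superposition of its basis states.
Then, coupling it with the second system results in an entangled state:

$$\begin{aligned}
|0\rangle_1|0\rangle_2 &\rightarrow |0\rangle_1|0\rangle_2 , \\
|1\rangle_1|0\rangle_2 &\rightarrow |1\rangle_1|1\rangle_2 , \\
\frac{1}{\sqrt{2}}\left(|0\rangle_1 + |1\rangle_1\right)|0\rangle_2 &\rightarrow \frac{1}{\sqrt{2}}\left(|0\rangle_1|0\rangle_2 + |1\rangle_1|1\rangle_2\right) .
\end{aligned} \tag{3.12}$$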
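
The three cases of (3.12) can be reproduced with a very small model of the
coupling. The sketch below represents a two-particle state by its four
amplitudes in the order |00>, |01>, |10>, |11> (our convention) and implements
the coupling as a conditional flip of system 2, an operation usually called a
controlled-NOT; the function names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)


def tensor(q1, q2):
    """Product state of two single-particle states (amplitudes for 0 and 1)."""
    return [q1[0] * q2[0], q1[0] * q2[1], q1[1] * q2[0], q1[1] * q2[1]]


def couple(state):
    """Coupling of Eq. (3.12): flip system 2 if and only if system 1 is in |1>."""
    c00, c01, c10, c11 = state
    return [c00, c01, c11, c10]


ZERO, ONE = (1, 0), (0, 1)
PLUS = (S, S)                      # superposition (|0> + |1>)/sqrt(2)

print(couple(tensor(ZERO, ZERO)))  # system 2 unchanged
print(couple(tensor(ONE, ZERO)))   # system 2 flipped
print(couple(tensor(PLUS, ZERO)))  # entangled state of Eq. (3.12)
```

Because the coupling acts linearly on the amplitudes, the superposition in
the last line is mapped onto the superposition of the first two results,
which no longer factors into states of the two systems.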

Although this basic principle of producing entangled states has been 
known since the very beginning of quantum mechanics, until recently there 
was no physical system where the necessary coupling could be realized. The 
progress in cavity QED [36] and ion trap experiments [128] allowed the first 
observation of entanglement between two atoms or two ions. These experi- 
ments are of great importance for the further development of experimental 
quantum computation. However, for quantum communication one needs to 
transfer the entangled particles over reasonable distances. Thus photons (with
wavelengths in the visible or near infrared) are clearly a better choice. For 
entangling photons via such a coupling, various methods have been proposed 
and partially realized [129, 130, 131] but still need to be investigated more 
thoroughly. Fortunately, the process of parametric down-conversion offers an 
ideal source of entangled photon pairs without the need for strong coupling 
(see Sect. 3.4.2). 

To perform Bell-state analysis, one first has to transform the entangled 
state into a product state. This is necessary since two particles can be an- 
alyzed only if they are measured separately. Otherwise one would need to 
entangle the two measurement apparatuses, each of which analyzes one of the 
two particles - clearly an even more challenging task. In principle, a disentan- 
gling transformation can be performed by reversing the entangling interaction 
described above. However, as long as such couplings are not achievable, one 
has to find replacements. In the following it is shown how two-particle inter- 
ference can be employed for partial Bell-state analysis (see Sect. 3.4.3). Since 
the manipulations and unitary transformations have to be performed on only 
one quantum particle at a time, this does not create new obstacles. These 
operations are often routine; in the case of light they have been routine for 
two centuries. 

3.4.1 Entangled Photon Pairs 

Entanglement between photons cannot yet be generated by coupling them via
an interaction. However, there are several emission processes, such as
atomic cascade decays or parametric down-conversion, where, owing to the 
conservation of energy and of linear or angular momentum, the properties of 
two emitted photons become entangled. 

Historically, entanglement between spatially separated quanta was first 
observed in measurements of the polarization correlation between the two $\gamma$
emissions in positron annihilation [132], soon after Bohm’s proposal for observing
EPR phenomena in spin-1/2 systems. After Bell’s discovery that the contradictory
predictions of quantum theory and local realistic theories can actually be observed, a series
of measurements was performed, mostly with polarization-entangled photons 
from a two-photon cascade emission from calcium [133]. In these experiments, 
the two photons were in the visible spectrum, and thus could be manipulated 
and controlled by standard optical techniques. Of course, this is a great ad- 
vantage compared with the positron annihilation source; however, the two 
photons are now no longer emitted in opposite directions, since the emitting 
atom carries away some randomly determined momentum. This makes exper- 
imental handling more difficult and also reduces the brightness of the source. 
The process of parametric down-conversion now offers a new possibility for 
efficiently generating entangled pairs of photons [111]. 

When light propagates through an optically nonlinear medium with second-order
nonlinearity (only possible in noncentrosymmetric crystals), the
conversion of a light quantum from the incident pump field into a pair of
photons in the “idler” and “signal” modes can occur. In principle, this can
be seen as the inverse of the frequency-doubling process in nonlinear optics
[134].

Fig. 3.6. The different relations between the emission directions for type I and type
II down-conversion

As mentioned above, energy and momentum conservation can give rise 
to entanglement in various degrees of freedom, such as position-momentum 
and time-energy entanglement. However, the interaction time and volume 
will determine the sharpness and quality of the correlations observed, which
are formally obtained by integration of the interaction Hamiltonian [135].
The interaction time is given by the coherence time $T_c$ of the UV pump light;
the volume is given by the extent and spatial distribution of the pump light 
in the nonlinear crystal. 

The relative orientations of the direction and polarization of the pump 
beam, and the optic axis of the crystal determine the actual direction of 
the emission of any given wavelength. We distinguish two possible alignment 
types (Fig. 3.6): for type I down-conversion, the pump has, for example, 
the extraordinary polarization and the idler and signal beams both have the 
ordinary polarization. Different colors are emitted into cones centered on the 
pump beam. 

In type II down-conversion, the pump has the extraordinary polarization 
and, in order to fulfill the momentum conservation condition inside the crystal 
(phase-matching), the two down-converted photons have different (for most
directions, orthogonal) polarizations, offering the possibility of a new source
of polarization-entangled photon pairs (Sect. 3.4.2). 

One can distinguish two basic ways to observe entanglement. In the first
way, by selecting detection events one can choose a subensemble of possible
outcomes which exhibits the nonclassical features of entangled states.* This
additional selection seems to contradict the spirit of EPR-Bell experiments;
however, it was shown recently that, after a detailed analysis of all detection
events, the validity of local hidden-variable theories can be tested on the basis
of refined Bell inequalities [141]. Therefore, such sources can also be useful
for entanglement-based quantum cryptography [142].

* For the observation of polarization entanglement, see [136]. For momentum
entanglement, see the proposal [137] and the experimental results [138]. Time-energy
entanglement was proposed in [139]. Experiments are described in [140].

Fig. 3.7. Photons emerging from type II down-conversion. The photons are always
emitted with the same wavelength but have orthogonal polarizations. At the
intersection points their polarizations are undefined but different, resulting
in entanglement

In the second way, true entangled photon pairs can be generated. This is 
essential for all the other quantum communication schemes, where one cannot 
use the detection selection method. Several methods to obtain momentum-entangled
pairs [143] have been demonstrated experimentally [144], but they are
extremely difficult to handle owing to the stringent requirements
on the stability of the whole setup. Any phase change, i.e. a change in the path
lengths by as little as 10 nm, is devastating for the experiment. The recently
developed source of time-energy-entangled photon pairs [145] partially
shares these problems and, to avoid detection selection, requires fast optical
switches. Fortunately, with polarization entanglement as produced by type
II parametric down-conversion, the stability requirements are considerably 
more relaxed. 



3.4.2 Polarization-Entangled Pairs from Type II Down-Conversion

In type II down-conversion, the down-converted photons are emitted into 
two cones, one with the ordinary polarization and the other with the ex- 
traordinary polarization. Because of conservation of transverse momentum 
the photons of each pair must lie on opposite sides of the pump beam. For 
the proper alignment of the optic axis of the nonlinear crystal, the two cones 
intersect along two lines (see Fig. 3.7) [111, 146]. Along the two directions 
(“1” and “2”) where the cones overlap, the light can be essentially described
by an entangled state:

$$
|\Psi\rangle = \tfrac{1}{\sqrt{2}} \left( |H\rangle_1 |V\rangle_2 + e^{i\alpha} |V\rangle_1 |H\rangle_2 \right) , \tag{3.13}
$$

where the relative phase $\alpha$ arises from the crystal birefringence, and an overall
phase shift is omitted. Using an additional birefringent phase shifter (or even
by slightly rotating the down-conversion crystal itself), the value of $\alpha$ can be
set as desired, e.g. to the value 0 or $\pi$. Thus, polarization-entangled states
are produced directly out of a single nonlinear crystal (beta barium borate, 
BBO), with no need for extra beam splitters or mirrors and no requirement 
to discard detected pairs. 

Best of all, by using two extra birefringent elements, one can easily produce
any of the four orthogonal Bell states. For example, when starting with
the state $|\Psi^+\rangle$, a net phase shift of $\pi$ and thus a transformation to the state
$|\Psi^-\rangle$ may be obtained by rotating a quarter-wave plate in one of the two
paths by 90° from the vertical to the horizontal direction. Similarly, a half-wave
plate in one path can be used to change a horizontal polarization to
vertical and to switch to the states $|\Phi^\pm\rangle$.
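
The action of the two elements can be written out explicitly. The following is a sketch under idealizing assumptions: the extra phase shift of $\pi$ is modeled by the operator $\sigma_z$ ($|H\rangle \to |H\rangle$, $|V\rangle \to -|V\rangle$) and the polarization flip by $\sigma_x$ ($|H\rangle \leftrightarrow |V\rangle$), both acting on photon 1 only, and global phases are ignored. The operator notation is introduced here for illustration.

```latex
\begin{align*}
|\Psi^\pm\rangle &= \tfrac{1}{\sqrt{2}} \left( |H\rangle_1 |V\rangle_2 \pm |V\rangle_1 |H\rangle_2 \right), &
|\Phi^\pm\rangle &= \tfrac{1}{\sqrt{2}} \left( |H\rangle_1 |H\rangle_2 \pm |V\rangle_1 |V\rangle_2 \right), \\
(\sigma_z \otimes 1)\, |\Psi^+\rangle &= \tfrac{1}{\sqrt{2}} \left( |H\rangle_1 |V\rangle_2 - |V\rangle_1 |H\rangle_2 \right) = |\Psi^-\rangle, \\
(\sigma_x \otimes 1)\, |\Psi^+\rangle &= \tfrac{1}{\sqrt{2}} \left( |V\rangle_1 |V\rangle_2 + |H\rangle_1 |H\rangle_2 \right) = |\Phi^+\rangle, \\
(\sigma_x \sigma_z \otimes 1)\, |\Psi^+\rangle &= \tfrac{1}{\sqrt{2}} \left( |V\rangle_1 |V\rangle_2 - |H\rangle_1 |H\rangle_2 \right) = -|\Phi^-\rangle .
\end{align*}
```

Manipulating a single photon of the pair is thus enough to reach all four Bell states, which is the property exploited in quantum dense coding.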

The birefringent nature of the down-conversion crystal complicates the 
actual entangled state produced, since the ordinary and the extraordinary 
photons have different velocities inside the crystal, and propagate along dif- 
ferent directions even though they become parallel and, for short crystals, 
collinear outside the crystal. The resulting longitudinal and transverse walk- 
off between the two polarizations in the entangled state is maximal for pairs 
created near the entrance face of the crystal, which consequently acquire the 
greatest time delay and relative lateral displacement. Thus the two possible 
emissions become, in principle, distinguishable by the order in which the de- 
tectors would fire or by their spatial location, and no entanglement will be 
observable. However, the photons are produced coherently along the entire 
length of the crystal. One can thus completely compensate for the longitudi- 
nal walk-off and partially for the transverse walk-off by using two additional 
crystals, one in each path [147]. By verifying the correlations produced by 
this source, one can observe strong violations of Bell’s inequalities (modulo 
the typical auxiliary assumptions) within a short measurement time [31]. 

The experimental setup is shown in Fig. 3.8a. The 351.1 nm pump beam
(150 mW) is obtained from a single-mode argon ion laser, followed by a dispersion
prism to remove unwanted laser fluorescence (not shown) [111]. Our
3 mm long BBO crystal was nominally cut such that the angle $\theta_{\mathrm{pm}}$ between
the optic axis and the pump beam was 49.2°, to allow collinear, degenerate
operation when the pump beam is precisely orthogonal to the surface.
The optic axis was oriented in the vertical plane, and the entire crystal was
tilted (in the plane containing the optic axis, the surface normal and the
pump beam) by 0.72°, thus increasing the effective value of $\theta_{\mathrm{pm}}$ inside the
crystal to 49.63°. The two cone overlap directions, selected by irises before
the detectors, were consequently separated by 6.0°. Each polarization analyzer
consisted of two-channel polarizers (polarizing beam splitters) preceded
by a rotatable half-wave plate. The detectors were cooled silicon avalanche
photodiodes operated in the Geiger mode. Coincidence rates $C(\theta_1, \theta_2)$ were
recorded as a function of the polarizer settings $\theta_1$ and $\theta_2$.

Fig. 3.8. (a) Experimental setup for the observation of entanglement produced by
a type II down-conversion source. The additional birefringent crystals are needed
to compensate for the birefringent walk-off effects from the first crystal. (b) Coincidence
fringes for the Bell states $|\Psi^\pm\rangle$ obtained when varying the
analyzer angle $\theta_1$, with $\theta_2$ set to 45°

In this experiment the transverse walk-off (0.3 mm) was small compared 
with the coherent pump beam width (2 mm), so the associated labeling effect 
was minimal. However, it was necessary to compensate for the longitudinal 
walk-off, since the 3.0 mm BBO crystal produced a time delay which was 
about the same as the coherence time of the detected photons (≈390 fs, determined
by interference filters with a width of 5 nm at 702 nm). We used
an additional BBO crystal (1.5 mm thick) as a compensator in each of the 
paths, preceded by a half-wave plate to exchange the roles of the horizontal 
and vertical polarizations. 

Under such conditions, we now routinely obtain a coincidence fringe visibility
(as polarizer 2 is rotated, with polarizer 1 fixed at −45°) of more than
97%, for irises with a size of 2 mm at a distance of 1.5 m from the crystal (Fig. 
3.8b). The high quality of this source is crucial for the overall performance of 
our experiments in quantum dense coding [114], quantum cryptography [113] 
and tests of Bell’s inequalities [31]. For the later experiments, the photons 
were coupled into single-mode fibers, to bridge long distances of the order 
of 400 m. To achieve a high coupling, the pump beam should be slightly fo- 
cused into the BBO crystal, to optimally match the microscope objectives 
used. Since the compensation crystals partially compensate the transverse 
walk-off, focusing down to 0.2 mm is not crucial. Visibilities of more than 
98% have been obtained this way, with an overall collection and detection 
efficiency of 10%. 
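
To make the notion of fringe visibility concrete, the following minimal sketch computes ideal coincidence probabilities for the states $|\Psi^\pm\rangle$ behind two linear polarizers and the visibility of the resulting fringe. The imperfection model (an unpolarized background of weight 1 − v added to the ideal state) and all function names are assumptions made for illustration; they are not the analysis used in the experiment.

```python
import math

def coincidence_probability(theta1_deg, theta2_deg, sign=-1, v=1.0):
    """Probability that both photons pass polarizers at theta1 and theta2.

    sign = +1 selects |Psi+>, sign = -1 selects |Psi->. The parameter v is a
    hypothetical purity: the state is modeled as v * ideal + (1 - v) * noise,
    where the noise term contributes a flat probability of 1/4.
    """
    t1, t2 = math.radians(theta1_deg), math.radians(theta2_deg)
    # Overlap of (|H>|V> + sign |V>|H>)/sqrt(2) with the two polarizer directions.
    amplitude = (math.cos(t1) * math.sin(t2)
                 + sign * math.sin(t1) * math.cos(t2)) / math.sqrt(2)
    return v * amplitude ** 2 + (1 - v) / 4

def fringe_visibility(theta1_deg, sign=-1, v=1.0):
    """Visibility (max - min)/(max + min) when polarizer 2 is scanned in 1 degree steps."""
    rates = [coincidence_probability(theta1_deg, th2, sign, v) for th2 in range(180)]
    return (max(rates) - min(rates)) / (max(rates) + min(rates))

print(fringe_visibility(-45, sign=-1))           # ideal state
print(fringe_visibility(-45, sign=-1, v=0.97))   # with a 3% flat background
```

In this simple model the visibility equals v, so a measured visibility of 97% would correspond to a background weight of about 3%.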

Such a source has a number of distinct advantages. It seems to be relatively
insensitive to larger collection irises, an important feature in experiments
where high count rates are crucial. In addition, owing to its simplicity,
the source is much quicker to align than other down-conversion setups
and is remarkably stable. One of the reasons is that phase drifts are
not detrimental to a polarization-entangled state unless they are birefringent,
i.e. polarization-dependent; this is a clear advantage over experiments
with momentum-entangled or energy-time-entangled photon pairs. Recently,
Kwiat and coworkers tested sandwiched type I crystals and achieved, for thin 
crystals, a significantly higher relative yield of entangled photon pairs [148]. 
Also, utilizing cavities to enhance the pump field in the nonlinear crystal can 
boost the output by a factor of 20 [149]. This gives hope that even more 
efficient generation of entangled photon pairs will be obtained in the future. 

3.4.3 Interferometric Bell-State Analysis 

At the heart of Bell-state analysis of a pair of particles is the transformation 
of an entangled state to an unentangled, product state. The necessary coupling,
however, has not yet been achieved for photons. But it turns out that the
interference of two entangled particles, and thus the photon statistics behind
beam splitters, depends on the entangled state that the pair is in [112, 150, 151].



The Principle. Let us discuss first the generic case of two interfering par- 
ticles. If we have two otherwise indistinguishable particles in different beams 
and overlap these two beams at a beam splitter, we ask ourselves, what is the 
probability to find the two particles in different output beams of the beam 
splitter (Fig. 3.9a). Alternatively we can ask, what is the probability that 
two detectors, one in each output beam, detect one photon each. 

If we performed this experiment with fermions, we would at first naively 
expect the two fermions to arrive in different output beams. This is sug- 
gested by the Pauli principle, which requires that the two particles cannot 
be in the same quantum state, that is, they cannot exit in the same output 
beam. Analogously, interference of bosons at a beam splitter will result in 
the expectation of finding both bosons in one output beam. For a symmetric 
50/50 beam splitter, it is fully random whether the two bosons will be detected
in the upper or the lower detector, but they will always be detected
by the same detector. However, it is important to realize that the statements 
above are only correct if one disregards the internal degrees of freedom of the 
interfering particles. 

Ultimately, the reason for the different behaviors lies in the different symmetries
of the wave functions describing bosonic and fermionic particles [150].
There are four different possibilities for how the two particles could propagate
from the input to the output beams of the beam splitter. We obtain
one particle in each output if both particles are reflected or both particles
are transmitted; we observe both particles at one detector if one particle
is transmitted and the other reflected, or vice versa. For the antisymmetric
states of fermions, the two possibilities of both particles being transmitted
and both being reflected interfere constructively, resulting in firing of each
of the two detectors. For the symmetric states of bosons, these two amplitudes
interfere destructively, giving no simultaneous detection in different
output beams [152]. For photons with identical polarizations, which means
for bosons, this interference effect has been known since the experiments by
Hong et al. [153],* but it has not yet been observed for fermions.

Fig. 3.9. (a) Interference of two particles at a beam splitter. The observation of
coincident detection, i.e. detection of one particle at each of the two detectors, is
sensitive to the symmetry of the spatial component of the quantum state of the
combined system. (b) Bell-state analyzer for identifying the Bell states $|\Psi^+\rangle$ and
$|\Psi^-\rangle$ by observing different types of coincidences. The other two Bell states
exhibit the same detection probabilities (both photons are detected by one detector)
for this setup and cannot be distinguished
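
The interference of the "both transmitted" and "both reflected" amplitudes can be made quantitative with a few lines of Python. This is a sketch assuming a lossless beam splitter with real transmission amplitude t and reflection amplitude r = i·sqrt(R); the function names and the parameter `exchange_sign` are illustrative conventions, not notation from the references.

```python
import math

def coincidence_probability(exchange_sign, reflectivity=0.5):
    """Probability of finding one particle in each output beam.

    exchange_sign = +1: symmetric spatial state (bosons),
    exchange_sign = -1: antisymmetric spatial state (fermions).
    """
    t = math.sqrt(1 - reflectivity)      # transmission amplitude
    r = 1j * math.sqrt(reflectivity)     # reflection amplitude with a 90 degree phase
    # "Both transmitted" and "both reflected" are added with the exchange sign.
    amplitude = t * t + exchange_sign * r * r
    return abs(amplitude) ** 2

def distinguishable_coincidence_probability(reflectivity=0.5):
    """Without interference the two probabilities simply add."""
    transmitted = 1 - reflectivity
    return transmitted ** 2 + reflectivity ** 2

print(coincidence_probability(+1))                 # bosons, 50/50: no coincidences
print(coincidence_probability(-1))                 # fermions, 50/50: always coincidences
print(distinguishable_coincidence_probability())   # distinguishable particles: 1/2
```

For a 50/50 beam splitter the three cases give 0, 1 and 1/2, which is why the coincidence rate is such a sensitive probe of the symmetry of the spatial state.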

What kinds of interference effects of two photons at a beam splitter are to
be expected if we consider also the internal degree of freedom of the photons,
i.e. their polarization? In particular, if we interfere two polarization-entangled
photons at a beam splitter, the Bell state describes only the internal degree of
freedom. Inspection of the four Bell states shows that the state $|\Psi^-\rangle$ is
antisymmetric, whereas the other three are symmetric. However, if two particles
interfere at a nonpolarizing beam splitter, what matters is only the spatial
part of the wave function. The symmetry of this spatial part is determined
by the requirement that, for two photons, the total state has to be symmetric.
We therefore obtain, for the total state of two photons in the antisymmetric
Bell state formed from two beams a and b at the beam splitter,

$$
|\Psi\rangle = \tfrac{1}{2} \left( |H\rangle_1 |V\rangle_2 - |V\rangle_1 |H\rangle_2 \right) \left( |a\rangle_1 |b\rangle_2 - |b\rangle_1 |a\rangle_2 \right) . \tag{3.14}
$$

This means that, for the state $|\Psi^-\rangle$, we also have an antisymmetric spatial
part of the wave function and thus expect a different detection probability,
that is, different coincidences between the two detectors, compared with the
other three Bell states.

* For further experiments and theoretical generalizations, see [154].

We therefore can discriminate the state $|\Psi^-\rangle$ from all the other states. It
is the only one which leads to coincidences between the two detectors in the
output beams of the beam splitter. Can we also identify the other Bell states?
If two photons are in the state $|\Psi^+\rangle$, they will both propagate in the same
output beam but with orthogonal polarizations in the horizontal/vertical
(H/V) basis, whereas two photons in the state $|\Phi^+\rangle$ or in the state $|\Phi^-\rangle$,
which also both leave the beam splitter in the same output arm, have the
same polarization in the H/V basis. Thus we can discriminate between the
state $|\Psi^+\rangle$ and the states $|\Phi^\pm\rangle$ by a polarization analysis in the H/V basis
and by observing either coincidences between the outputs of a two-channel
polarizer or both photons again in only one output (Fig. 3.9b). Note that
reorientation of the polarization analysis allows one to separate any other of
these three states from the other two, but it is not possible to distinguish
between all of them simultaneously [155]. If the photons were entangled in
yet another degree of freedom, i.e. they were four-state systems rather than
regular qubits, one could also discriminate between the states $|\Phi^+\rangle$ and $|\Phi^-\rangle$
[156]. But up to now, no quantum communication scheme seems to have 
profited from this fact. 

Summarizing, we conclude that two-photon interference can be used to 
identify two of the four Bell states, with the other two giving the same third 
detection result. One thus cannot perform complete Bell-state analysis by 
these interferometric means, but we can identify three different settings in 
quantum dense coding and, for teleportation, even identification of only one 
of the Bell states is sufficient to transfer any quantum state from one particle 
to another, although then only in a quarter of the trials. 
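
The classification summarized above can be reproduced with a short script. It is a sketch only: the Bell states are represented by their amplitudes in the H/V basis, the three detection signatures are plain strings chosen for illustration, and the rule "antisymmetric polarization state implies coincidences behind the beam splitter" is taken from the argument leading to (3.14).

```python
import math

S = 1 / math.sqrt(2)
# Amplitudes over the polarization basis HH, HV, VH, VV (photon 1, photon 2).
BELL_STATES = {
    "Psi+": (0, S, S, 0),
    "Psi-": (0, S, -S, 0),
    "Phi+": (S, 0, 0, S),
    "Phi-": (S, 0, 0, -S),
}

def signature(state, tol=1e-12):
    """Detection pattern of a Bell state in the analyzer of Fig. 3.9b."""
    hh, hv, vh, vv = state
    swapped = (hh, vh, hv, vv)  # exchange of photons 1 and 2
    if all(abs(a + b) < tol for a, b in zip(state, swapped)):
        # Antisymmetric polarization state: antisymmetric spatial part as well.
        return "coincidence between the two beam-splitter outputs"
    if abs(hh) < tol and abs(vv) < tol:
        return "same output beam, coincidence between the H and V channels"
    return "same output beam, both photons in one polarizer channel"

def distinguishable_classes():
    """Group the Bell states by their detection signature."""
    classes = {}
    for name, state in BELL_STATES.items():
        classes.setdefault(signature(state), []).append(name)
    return classes

classes = distinguishable_classes()
for pattern, names in classes.items():
    print(names, "->", pattern)
print("bits per transmitted photon:", math.log2(len(classes)))
```

Three distinguishable classes correspond to log2(3) ≈ 1.58 bits per transmitted two-state particle in dense coding, compared with 1 bit for classical coding and 2 bits for a complete Bell-state analysis.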



Bell-State Analysis of Independent Photons. The above description
of how to apply two-photon interference for Bell-state analysis can give only
some hints about the possible procedures. One intuitively feels that the necessary
joint detection of the two photons has to be “in coincidence”. But what
really are the experimental requirements for the two photons to interfere? 
The coincidence conditions can be obtained using a more refined analysis 
that takes the multimode nature of the states involved into account [157]. 

Interference occurs only if the contributing possibilities for finding one 
photon in each output are indistinguishable. If the two photons come from 
different sources or, as is the case in the experiments, from different down- 
conversion emissions, there might be some timing information, in our case 
detection of the second photon from each down-conversion, which might ren- 
der the possibilities distinguishable. 
For example, if we detect one photon behind the beam splitter at almost
the same time as one of the additional down-conversion photons, we can infer 
the origin of the photon that is to interfere. However, if the time difference 
between the detection events of the two interfering photons, that is, the over- 
lap at the beam splitter, is much less than their coherence time, then the 
detection of any other photon cannot give any additional information about 
their origin. This ultra-coincidence condition requires the use of narrow filters 
in order to make the coherence time as long as possible. However, even if we 
consider using state-of-the-art interference filters that yield a coherence time 
of about 3 ps, no detectors fast enough exist at present. And an even stronger 
filtering by Fabry-Perot cavities (to achieve the necessary coherence time of 
about 500 ps) results in prohibitively low count rates. Only a considerable 
increase of the number of photon pairs emitted into a narrow wavelength 
window may allow one to use this technique (e.g. with a subthreshold OPO 
configuration as demonstrated in [158]). 

The best choice, as it turns out, is not to try to detect the two photons 
simultaneously, but rather to generate them with a time definition much 
better than their coherence time. Consider two down-conversion processes 
pumped by pulsed UV beams (using either two crystals or, as is the case 
in our experiments, one crystal pumped by two passages of a UV beam). 
Again we attempt to observe interference between two photons, one from each 
down-conversion process. Then, without any narrow filters in the beams, the 
tight time correlation of the photons coming from the same down-conversion 
permits one again to associate simultaneously detected photons with each 
other. This provides path information and hence prohibits interference. 

We now insert filters before (or behind) the beam splitter. With stan- 
dard filters, and thus also with high enough count rates, one easily achieves 
coherence times on the order of 1 ps. And it is possible to pump the two down- 
conversion processes with UV pulses with a duration shorter than 200 fs. Thus 
it follows that the photons detected behind the beam splitter carry practi- 
cally no information anymore on the detection times of their twin photons, 
and, vice versa, detection of those latter photons does not give which-path 
information, which would destroy the interference. 

The “coincidence time” for registering the photons now can be very long; 
it merely needs to be shorter than the repetition time of the UV pulses, which 
is on the order of 10 ns for commercially available laser systems. One thus 
can expect very good visibility of interference and very good precision of the 
Bell-state analysis. 
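
The two timing requirements of the pulsed scheme can be collected in a small check. This is only a bookkeeping sketch: the function name and dictionary keys are illustrative, and the 5 ns coincidence window is a hypothetical value, whereas the pulse duration, coherence time and repetition time are the figures quoted above.

```python
def check_pulsed_scheme(pulse_fs, coherence_fs, window_ns, repetition_ns):
    """Return the two timing conditions of the pulsed scheme as booleans."""
    return {
        # The photons must be created within a time shorter than their
        # (filtered) coherence time, so that detection times carry no
        # which-path information.
        "creation_time_shorter_than_coherence_time": pulse_fs < coherence_fs,
        # The electronic coincidence window only has to separate
        # successive pump pulses.
        "coincidence_window_shorter_than_repetition_time": window_ns < repetition_ns,
    }

# 200 fs pulses, 1 ps coherence time and 10 ns repetition time are taken
# from the text; the 5 ns coincidence window is a hypothetical choice.
conditions = check_pulsed_scheme(pulse_fs=200, coherence_fs=1000,
                                 window_ns=5, repetition_ns=10)
print(conditions)
```

The point of the design is visible in the numbers: the demanding femtosecond-scale condition is met by the pump laser and the filters, while the detectors and electronics only need nanosecond resolution.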

3.4.4 Manipulation and Detection of Single Photons 

For polarization-entangled photons, the unitary transformations transform- 
ing between the four Bell states can be performed with standard half-wave 
and quarter-wave retardation plates. 






As mentioned before, in order to have the maximum freedom in setting
any of the Bell states, one inserts one half-wave and one quarter-wave plate
into the beam. By precompensating the additional quarter-wave shifts with
the compensator plates of the EPR source, one obtains at the output of the
transformation plates the state $|\Psi^+\rangle$ if both optic axes are aligned along the
vertical direction. Rotation of only the quarter-wave plate to the horizontal
direction transforms this state to $|\Psi^-\rangle$, and rotation of only the half-wave
plate by 45° gives $|\Phi^+\rangle$. Finally, rotating one plate by 90° and the other one
by 45° gives $|\Phi^-\rangle$.
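
These plate settings can be verified with Jones matrices. The sketch below rests on several assumptions that the text does not fix: ideal retarders, angles measured from the vertical direction, the photon passing the half-wave plate before the quarter-wave plate, and a source precompensated such that both plates at 0° yield $|\Psi^+\rangle$. All function names are illustrative.

```python
import cmath
import math

S = 1 / math.sqrt(2)
# Amplitudes over HH, HV, VH, VV (photon 1, photon 2).
BELL = {
    "Psi+": [0, S, S, 0],
    "Psi-": [0, S, -S, 0],
    "Phi+": [S, 0, 0, S],
    "Phi-": [S, 0, 0, -S],
}

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def wave_plate(retardance, angle_deg):
    """Jones matrix (H, V basis) of a retarder whose axis is rotated by
    angle_deg away from the vertical direction."""
    th = math.radians(angle_deg)
    c, s = math.cos(th), math.sin(th)
    rot, rot_back = [[c, -s], [s, c]], [[c, s], [-s, c]]
    plate = [[1, 0], [0, cmath.exp(1j * retardance)]]
    return matmul(rot, matmul(plate, rot_back))

def plates(hwp_deg, qwp_deg):
    """Half-wave plate followed by quarter-wave plate (assumed order)."""
    return matmul(wave_plate(math.pi / 2, qwp_deg), wave_plate(math.pi, hwp_deg))

def encode(hwp_deg, qwp_deg):
    """Bell state produced from the precompensated source for given settings."""
    net = matmul(plates(hwp_deg, qwp_deg), dagger(plates(0, 0)))
    out = [0j] * 4
    for new in range(2):              # the plates act on photon 1 only
        for old in range(2):
            for j in range(2):
                out[2 * new + j] += net[new][old] * BELL["Psi+"][2 * old + j]
    for name, bell in BELL.items():
        if abs(sum(b * o for b, o in zip(bell, out))) ** 2 > 1 - 1e-9:
            return name
    return None

for settings in [(0, 0), (0, 90), (45, 0), (45, 90)]:
    print(settings, encode(*settings))
```

Under these assumptions the four settings (half-wave plate at 0° or 45°, quarter-wave plate at 0° or 90°) map one-to-one onto the four Bell states. With the opposite plate order the two $|\Phi\rangle$ states would exchange roles, so the assignment depends on the actual arrangement.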

For initial experimental realizations of the ideas of quantum communica- 
tion, such a static polarization manipulation is sufficient. However, for quan- 
tum cryptography, and also for practical applications of other schemes, one 
would like to be able to switch the unitary transformation rapidly to any po- 
sition. This can be achieved by fast Pockels cells. Depending on the applied 
voltage, these devices have different indices of refraction for two orthogonal 
polarization components, and can be used in a similar way to the quartz 
retardation plates [31]. 

Detection of the single photons has been performed using silicon avalanche 
photodiodes operated in the Geiger mode. The diodes used have a detection 
efficiency of about 40%. Owing to losses in the interference filters and other 
optical components, the overall detection efficiency of a photon emitted from 
the source was around 10% in cw experiments; for experiments using a pulsed 
source, we achieved an efficiency of only about 4%. In many interference 
experiments, a good definition of the transverse-mode structure of the beams 
is necessary. An ideal solution for achieving high interference contrast is thus 
to couple the output arms of a beam splitter into single-mode fibers and 
connect these fibers to pigtailed avalanche photodiodes. The single-mode fiber 
acts as a very good spatial filter for the transverse modes and couples the 
light efficiently to the diodes. 



3.5 Quantum Communication Experiments 

3.5.1 Quantum Cryptography 

In the first experiments [159], the researchers concentrated on the distribution
of pairs of entangled photons over large lengths of fibers, rather than on
including fast, random switching. In these indoor experiments, where the
optical fiber was wound on a fiber drum, one of the photons was chosen
to have a wavelength of $\lambda$ = 1300 nm, and the other in the near infrared
for optimal detection efficiency (here the down-conversion was pumped by
a krypton ion laser at 460 nm). Time-energy entanglement was used, with
asymmetric interferometers at the observer stations and selection of true
coincidences. Such a scheme allows the correlated photons to have a wide
frequency distribution, and thus a relatively high intensity, since the visibility
of the interference effects depends on the monochromaticity of the pump laser
light.

Fig. 3.10. Setup for entanglement-based quantum cryptography. The polarization-entangled
photons are transmitted via optical fibers to Alice and Bob, who are
separated by 400 m, and both photons are analyzed, detected and registered
independently. After a measurement run, the quantum keys are established by Alice
and Bob through classical communication over a standard computer network

In a more recent experiment, both photons were produced with a wavelength
around 1300 nm. Here, for the first time, laser diodes ($\lambda$ = 650 nm)
were used for pumping the down-conversion, in contrast to the expensive 
laser systems used in other experiments. This allowed the demonstration of 
nonclassical correlations between two observers separated by more than 10 
km in the Geneva area [30]. Standard optical telecommunication fibers con- 
necting offices of the Swiss telecommunication company were used to send 
the photons to two interferometers, where phase modulation served to set 
the analysis parameters. The robustness of the source, together with the 
high degree of quantum entanglement, opens new prospects for this secure 
communication technique.*
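
The wavelengths quoted for the different sources are tied together by energy conservation in the down-conversion process, 1/λ_pump = 1/λ_signal + 1/λ_idler. The following minimal sketch evaluates this relation; the function names are illustrative, and the quoted pump and photon wavelengths are taken at face value from the text.

```python
def partner_wavelength_nm(pump_nm, signal_nm):
    """Wavelength of the second photon from 1/pump = 1/signal + 1/idler."""
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)

def degenerate_wavelength_nm(pump_nm):
    """Both photons share the pump energy equally: twice the pump wavelength."""
    return 2.0 * pump_nm

print(degenerate_wavelength_nm(650.0))        # laser-diode pump -> 1300 nm pairs
print(degenerate_wavelength_nm(351.1))        # argon ion pump   -> 702.2 nm pairs
print(partner_wavelength_nm(460.0, 1300.0))   # partner of a 1300 nm photon
```

The degenerate cases reproduce the 1300 nm and 702 nm photon wavelengths mentioned in the text. For a 460 nm pump and one photon at 1300 nm, the relation puts the partner photon at about 712 nm, where silicon detectors are efficient.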

The scheme of the first realistic demonstration of entanglement-based key
distribution is sketched in Fig. 3.10 [113]. The source uses type II parametric
down-conversion in BBO, pumped with an argon ion laser working at a wavelength
of 351 nm and a power of 350 mW. The photons, with a wavelength
of 702 nm, are each coupled into 500 m long optical fibers and transmitted to
Alice and Bob, who are separated by 400 m. Alice and Bob both use a Wollaston
polarizing beam splitter as a polarization analyzer. We shall associate
a detection of parallel polarizations, +1, with the key bit 1 and detection of
orthogonal polarizations, −1, with the key bit 0.

Electrooptic modulators in front of the analyzers rapidly switch (<15 ns)
the axis of the analyzer between two desired orientations, controlled by quantum
random signal generators. These quantum random number generators
are based on the quantum mechanical process of splitting a beam of photons
and have a correlation time of less than 100 ns [161]. The photons are detected
by silicon avalanche photodiodes, and time interval analyzers on local
personal computers register all detection events as time stamps together with
the settings of the analyzers and the detection results.

* For recent experiments on entanglement-based cryptography, see [160].

Quantum key distribution is started by a single light pulse sent from the 
source to Alice and Bob via a second optical fiber. After a run of about 5 s 
duration has been completed, Alice and Bob compare their lists of detec- 
tions to extract the coincidences. In order to record the detection events very 
accurately, the time bases in Alice’s and Bob’s time interval analyzers are 
controlled by two rubidium oscillators. Overall, the system has a measured 
rate of total coincidences of about 1700 per second, and a collection efficiency of
each photon path of 5%. All the necessary equipment for the source, Alice and
Bob has been proven to operate outside shielded laboratory environments
with a very high reliability.

For the realization of entanglement-based quantum cryptography using 
the Wigner inequality, Alice switches the analyzer randomly between −30°
and 0°, and Bob between 0° and +30°. After a run, Alice and Bob extract
from the coincidences the probabilities $p_{++}(0^\circ, 30^\circ)$, $p_{++}(-30^\circ, 0^\circ)$ and
$p_{++}(-30^\circ, 30^\circ)$ for the corresponding analyzer settings. We obtain −0.112 ±
0.014 for the left-hand side of the Wigner inequality (3.7), which is in good
agreement with the predictions of quantum mechanics, and the coincidences 
obtained at the parallel settings, (0°, 0°), can be used as a quantum key. In a 
typical run, Alice and Bob established 2162 bits of raw quantum key material 
at a rate of 420 baud, and observed a quantum bit error rate (QBER) of 3.4%. 
By biasing the frequencies of the analyzer combinations, the production rate 
of the quantum keys can be increased to about 1700 baud without sacrificing 
security. 
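As a rough numerical cross-check of the value quoted above, the following sketch evaluates the left-hand side of the Wigner inequality for an ideal entangled pair. It rests on two assumptions that are not spelled out in this section: that inequality (3.7) has the form p++(α_A, β_B) + p++(β_A, γ_B) − p++(α_A, γ_B) ≥ 0, and that quantum mechanics predicts p++(a, b) = ½ sin²(a − b) for the state used. The function names are illustrative only.

```python
import math

def p_plus_plus(alice_deg, bob_deg):
    """Ideal probability of a '++' coincidence.

    Assumption: p++ = 1/2 * sin^2(alice - bob), the prediction for a
    polarization singlet; the source state is not restated in this section.
    """
    delta = math.radians(alice_deg - bob_deg)
    return 0.5 * math.sin(delta) ** 2

def wigner_lhs(alpha=-30.0, beta=0.0, gamma=30.0):
    """Left-hand side of the assumed Wigner inequality (local realism: >= 0)."""
    return (p_plus_plus(alpha, beta)
            + p_plus_plus(beta, gamma)
            - p_plus_plus(alpha, gamma))

ideal = wigner_lhs()
measured, sigma = -0.112, 0.014          # values quoted in the text
print(f"ideal quantum value: {ideal:.3f}")
print(f"deviation of measurement: {abs(measured - ideal) / sigma:.2f} sigma")
```

Under these assumptions the ideal value is −1/8, so the measured −0.112 ± 0.014 lies within one standard deviation of it while clearly violating the bound of zero.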

To demonstrate the entanglement-based BB84 scheme, Alice’s and Bob’s 
analyzers both switched independently and randomly between 0° and 45°. 
After a measurement run, Alice and Bob extracted the coincidences measured 
with parallel analyzers to generate the quantum key. In the experiment, Alice 
and Bob collected 80 000 bits of quantum key at a rate of 850 baud and 
observed a quantum bit error rate of 2.5%. To correct the remaining errors 
and ensure the secrecy of the key, various classical error correction and privacy 
amplification schemes have been developed. With a very fast and efficient 
algorithm, a single iteration gives 49 984 bits with a significantly reduced 
QBER of 0.40% [113]. 
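The sifting step described above can be sketched with a toy simulation. This is not the experimental data analysis: it assumes perfectly correlated results whenever both analyzers are parallel, and it models all imperfections by a single bit-flip probability (set here to the 2.5% QBER quoted in the text). The names `sift_key` and `qber` are hypothetical.

```python
import random

def sift_key(n_pairs, error_prob=0.025, seed=0):
    """Toy model of entanglement-based BB84 sifting.

    Each party picks 0 or 45 degrees at random; only coincidences with
    parallel analyzers are kept. Bob's bit is flipped with error_prob.
    """
    rng = random.Random(seed)
    alice_key, bob_key = [], []
    for _ in range(n_pairs):
        alice_basis = rng.choice((0, 45))
        bob_basis = rng.choice((0, 45))
        if alice_basis != bob_basis:
            continue                      # non-parallel settings are discarded
        alice_bit = rng.randint(0, 1)
        flipped = rng.random() < error_prob
        bob_bit = alice_bit ^ int(flipped)
        alice_key.append(alice_bit)
        bob_key.append(bob_bit)
    return alice_key, bob_key

def qber(alice_key, bob_key):
    """Fraction of sifted key positions in which the two keys differ."""
    errors = sum(a != b for a, b in zip(alice_key, bob_key))
    return errors / len(alice_key)

alice, bob = sift_key(20000, error_prob=0.025, seed=1)
print(len(alice), "sifted bits, QBER =", round(qber(alice, bob), 4))
```

In this model about half of the detected pairs survive sifting, which is why biasing the analyzer settings, as mentioned for the Wigner scheme, can raise the key rate.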

3.5.2 Quantum Dense Coding 

For the first realization of this quantum communication scheme, the experi- 
ment consisted of three distinct parts (Fig. 3.11): the EPR source, generating 
entangled photons in a well-defined state; Alice’s station, for encoding the 
messages by a unitary transformation of her particle; and Bob’s Bell-state 
analyzer, for reading the signal sent by Alice.

Fig. 3.11. Experimental setup for quantum dense coding. The two entangled photons
created by type II down-conversion are distributed to Alice and Bob. Alice
sends her photon, after manipulation with birefringent plates, to Bob, who can read
the encoded information by interferometric Bell-state analysis. The path length
delay $\Delta$ is varied to achieve optimal interference



The polarization-entangled photons, with a wavelength of $\lambda$ = 702 nm,
were, similarly to the quantum cryptography experiment, produced by degenerate
noncollinear type II down-conversion in a nonlinear BBO crystal
along two distinct emission directions (carefully selected by 2 mm irises,
1.5 m away from the crystal). One beam was directed to Alice’s encoding
station, the other directly to Bob’s Bell-state analyzer. The settings were
such that we obtained the entangled state $|\Psi^+\rangle$ behind the compensation
crystals (not shown in the figure) and Alice’s manipulation unit when the
retardation plates were both set to the vertical direction after compensation
of birefringence in the BBO crystal.

The beam manipulated in Alice’s encoding station was combined with 
the other beam in Bob’s Bell-state analyzer. Bob’s analyzer consisted of a 
single beam splitter followed by two-channel polarizers in each of its outputs, 
and proper coincidence analysis between four single-photon detectors. In the 
alignment procedure, optical trombones were employed to equalize the path 
lengths to well within the coherence length of the down-converted photons 
(100 μm), in order to observe the two-photon interference.

To characterize the interference observable at Bob’s Bell-state analyzer, 
we varied the path length difference $\Delta$ of the two beams with the optical
trombone. If the path length difference is larger than the coherence length, 
no interference occurs and one obtains classical statistics for the coincidence 
count rates at the detectors. With optimal path length tuning, interference 
enables one to read the encoded information.

Fig. 3.12. Coincidence rates $C_{HV}$ (•) and $C_{HV'}$ (∘) as functions of the path length
difference $\Delta$, when the states $|\Psi^+\rangle$ (left) or $|\Psi^-\rangle$ (right) are analyzed by Bob’s
interferometric Bell-state analyzer



Figure 3.12 shows the dependence of the coincidence rates $C_{HV}$ (•) and
$C_{HV'}$ (∘) on the path length difference, when either the state $|\Psi^+\rangle$ (left) or
the state $|\Psi^-\rangle$ (right) has been sent to the Bell-state analyzer (the rates $C_{H'V'}$ and
$C_{H'V}$ display analogous behavior; we use the notation $C_{AB}$ for the coincidence
rate between the detectors $D_A$ and $D_B$). For perfect path length tuning, $C_{HV}$
reaches its maximum for $|\Psi^+\rangle$ (left) and vanishes (apart from noise) for $|\Psi^-\rangle$
(right). $C_{HV'}$ displays the opposite dependence and clearly signifies $|\Psi^-\rangle$. The
results of these measurements imply that if both photons are detected, we
can identify the state $|\Psi^+\rangle$ with a reliability of 95%, and the state $|\Psi^-\rangle$
with a reliability of 93%.

The performance of the dense-coding transmission is influenced not only 
by the quality of the alignment procedure, but also by the quality of the 
states sent by Alice. In order to evaluate the latter, the beam splitter was 
translated out of the beams. Then an Einstein-Podolsky-Rosen-Bell-type 
correlation measurement analyzed the degree of entanglement of the source, 
as well as the quality of Alice’s transformations. The correlations were only
1-2% higher than the visibilities with the beam splitter in place, which means
that the quality of this experiment is limited more by the quality of the 
entanglement of the two beams than by that of the interference achieved. 

Fig. 3.13. Coincidence rates as functions of the path length difference $\Delta$. Because
of the nature of the Si avalanche photodiodes, the extension shown in the inset is
necessary for identifying two-photon states in one output

When using Si avalanche diodes in the Geiger mode for single-photon detection,
a modification of the Bell-state analyzer is necessary, since then, for
the states $|\Phi^\pm\rangle$, one has to register the two photons leaving the Bell-state
analyzer via a coincidence detection. One possibility is to avoid interference
altogether for these states by introducing polarization-dependent delays before
Bob’s beam splitter. Another approach is to split the incoming two-photon
state at an additional beam splitter and to detect it (with 50% likelihood)
by a coincidence count between detectors in each output (inset of Fig. 3.13).
For the purpose of a proof-of-principle demonstration, we put such a configuration
in place of detector $D_H$ only. Figure 3.13 shows the increase of
the coincidence rate $C_{H\bar{H}}$ (□) for zero path length difference, with the other
rates at the background level, when Alice sends the state $|\Phi^-\rangle$. Since we can
now distinguish the three different messages, the stage is set for the quantum 
dense-coding transmission. Figure 3.14 shows the various coincidence rates 
(normalized to the corresponding maximum rate of the transmitted state), 
when the ASCII codes of “KM°” (i.e. codes 75, 77, 179) were sent in only 15 
trits instead of 24 classical bits. 
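The counting behind "15 trits instead of 24 bits" can be illustrated with a short sketch. It writes each of the three character codes as five base-3 digits; which Bell state represents which trit value is not specified here, so the digits 0, 1, 2 are only abstract labels, and the helper names are hypothetical.

```python
import math

def to_trits(value, width=5):
    """Base-3 digits of value, most significant first, padded to width."""
    if not 0 <= value < 3 ** width:
        raise ValueError("value does not fit into the requested number of trits")
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, 3)
        digits.append(remainder)
    return digits[::-1]

def from_trits(digits):
    """Inverse of to_trits."""
    value = 0
    for digit in digits:
        value = 3 * value + digit
    return value

codes = [75, 77, 179]                      # the three codes quoted in the text
encoded = [to_trits(code) for code in codes]
total_trits = sum(len(word) for word in encoded)

print(encoded)
print(total_trits, "trits versus", 8 * len(codes), "bits")
print("bits per three-valued symbol:", round(math.log2(3), 2))
```

Five trits distinguish 3^5 = 243 values, which covers the three codes used here but not every 8-bit value; the figure of 1.58 bits per photon is log2(3).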

Fig. 3.14. “1.58 bits per photon” quantum dense coding: the ASCII codes for the
letters “KM°” (i.e. 75, 77, 179) are encoded in 15 trits instead of the 24 bits usually
necessary. The data for each type of encoded state are normalized to the maximum
coincidence rate for that state

From this measurement, one can also obtain a signal-to-noise ratio by
comparing the rates signifying the actual state with the sum of the two other
rates registered. The ratios for the transmission of the three states varied
owing to the different visibilities of the corresponding interferences and were
about 14% and 9%. The signal-to-noise ratio achieved results in an actual 
channel capacity of 1.13 bits per transmitted (and detected) two-state photon 
and thus clearly exceeds the channel capacity of 1 bit achievable with noise- 
free classical communication. 

3.5.3 Quantum Teleportation of Arbitrary Qubit States 

In this experiment, polarization-entangled photons were produced again by 
type II down-conversion in a nonlinear BBO crystal (see Fig. 3.15), but here 
the UV beam was pulsed to obtain a high time definition of the creation of 
the pairs (the pulses had a duration of about 200 fs and $\lambda$ = 394 nm). The
entangled pair of photons 2 and 3 is produced in the first passage of the 
UV pulse through the nonlinear crystal, and the pair 1 and 4 after the pulse 
has been reflected at a mirror back through the crystal. Mirrors and beam 
splitters (BS) are used to steer and to overlap the light beams. Polarizers 
(Pol) and polarizing beam splitters (PBS), together with birefringent retardation
plates ($\lambda/2$), prepare and analyze the polarization of the photons. All
single-photon detectors indicated in the figure (silicon avalanche photodiodes 
operated in the Geiger mode) are equipped with narrow-band interference fil- 
ters; the detectors of Alice’s Bell-state analyzer are equipped with additional 
single-mode fiber couplers for spatial filtering. 

For the first demonstration of quantum teleportation [10, 11], we prepared 
particle 1 in various nonorthogonal polarization states using a polarizer and a 
quarter-wave plate (not shown). Behind Bob’s “receiver”, polarization analysis
was performed to prove the dependence of the polarization of photon 3 on
the polarization of photon 1. (In this case we used the registration of photon 
4 only to define the time of appearance of photon 1.) 

The first task now is to prove that no information about the state of 
photon 1 is revealed during the Bell-state measurement of Alice. Figure 3.16 
shows the coincidence rate between detectors f1 and f2 when the overlap
of photons 1 and 2 at the beam splitter was varied (for this, we changed 
the position of the mirror reflecting the pump beam back into the crystal). 
The characteristic interference effect, a reduction of the coincidence rate, 
occurs only around zero delay. Outside this region, which is on the order of 
the coherence length of the detected photons, no reduction occurs, and the 
two photons are detected in coincidence with 50% probability. Within the 
statistics, there is no difference between the two data sets, although particle 1 
was prepared in two mutually orthogonal states (+45° and −45°). Obviously,
Alice has no means to determine which of the two states particle 1 was in 
after the projection into the Bell-state basis. 

Fig. 3.15. Experimental setup for quantum teleportation. A UV pulse passing
through a nonlinear crystal creates an entangled pair of photons 2 and 3 in the state
$|\Psi^-\rangle$, which is distributed to Alice and Bob. During its second passage through the
crystal, after retroflection, the UV pulse creates another pair of photons, of which
one is prepared in the initial state to be teleported (photon 1), and the other one
(4) serves as a trigger indicating that a photon to be teleported is on its way. Alice
then looks for coincidences behind her beam splitter, where the initial photon and
one of the ancillaries are superposed. Bob, after receiving the classical information
that Alice has obtained a coincidence count identifying the $|\Psi^-\rangle$ Bell state, knows
that his photon 3 is in the initial state of photon 1, which then can be verified using
polarization analysis

Fig. 3.16. Coincidence rate between the two detectors of Alice’s Bell-state analyzer
as a function of the delay between the two photons 1 and 2. The data for the +45°
and −45° polarizations of photon 1 are equal within the statistics, which shows
that no information about the state of photon 1 is revealed to Alice

Figure 3.17 shows the polarization of photon 3 after the teleportation
protocol has been performed, again as the delay between photons 1 and 2 is
varied. When interference occurs at the beam splitter, i.e. around zero delay,
the polarization of photon 3 is given by the settings for photon 1. The two
graphs show the results obtained when the initial polarization of photon 1
was set either to 45° or to vertical polarization and then the polarization of 
photon 3 along the corresponding direction was analyzed. The reduction in 
the polarization to about 65% is due to the limited degree of entanglement 
between photons 2 and 3 (85%), and to the reduced contrast of the interfer- 
ence at the beam splitter as a consequence of the relatively short coherence 
time of the detected photons. Of course, better beam definition by narrow 
pinholes and more stringent filtering could improve this value. However, this 
would cause further, unacceptable loss in the fourfold coincidence rates. Each 
of the polarization data points shown was obtained from about 100 fourfold
coincidence counts in 4000 s.

Fig. 3.17. Polarization of photon 3 after teleportation, compared with the polarization
initially prepared on photon 1. The analyzer testing the quality of the
teleportation performed by Alice and Bob was oriented parallel to the initial polarization
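The logic of the protocol can be checked with an idealized state-vector sketch. It assumes the encoding H → index 0 and V → index 1, takes photons 2 and 3 to be in the state |Ψ−⟩, and considers only the case in which Alice's coincidence identifies |Ψ−⟩ for photons 1 and 2. Experimental imperfections, such as the reduced polarization of about 65% discussed in the text, are not modeled, and the function names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
# |Psi-> = (|HV> - |VH>)/sqrt(2) as an amplitude table indexed [first][second],
# with H -> 0 and V -> 1 (encoding chosen for this sketch).
PSI_MINUS = [[0.0, S], [-S, 0.0]]

def teleport(phi):
    """Project photons 1 and 2 onto |Psi-> and return (state of photon 3, probability).

    phi holds the two amplitudes of photon 1; photons 2 and 3 start in |Psi->.
    """
    norm = math.sqrt(sum(abs(c) ** 2 for c in phi))
    phi = [complex(c) / norm for c in phi]
    out = [0j, 0j]
    for i1 in range(2):
        for i2 in range(2):
            for i3 in range(2):
                joint = phi[i1] * PSI_MINUS[i2][i3]       # amplitude of |i1 i2 i3>
                out[i3] += PSI_MINUS[i1][i2] * joint      # apply <Psi-| to photons 1, 2
    prob = sum(abs(c) ** 2 for c in out)
    return [c / math.sqrt(prob) for c in out], prob

def fidelity(a, b):
    """Squared overlap of two (not necessarily normalized) polarization states."""
    overlap = sum(complex(x).conjugate() * y for x, y in zip(a, b))
    norm_a = sum(abs(x) ** 2 for x in a)
    norm_b = sum(abs(y) ** 2 for y in b)
    return abs(overlap) ** 2 / (norm_a * norm_b)

for phi in [(1, 1), (0, 1), (1, 1j)]:      # 45 degrees, vertical, circular
    state3, prob = teleport(phi)
    print(phi, "fidelity:", round(fidelity(phi, state3), 6), "probability:", round(prob, 6))
```

In this ideal model photon 3 carries the input state exactly (up to a global phase) in one quarter of the cases, independently of which state was prepared.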



These measurements and also runs with the initial polarization along 
other directions demonstrate the ability to teleport the polarization of any 
pure state. Of course, since the directions used are mutually nonorthogonal, 
one can infer that the scheme works for any arbitrary quantum state. How- 
ever, there is a much more direct way to experimentally demonstrate the full 
power of quantum teleportation. 

One way to demonstrate that any arbitrary quantum state can be trans- 
ferred is to use the fact that we can also obtain entanglement between photons 
1 and 4 (Fig. 3.18). After the polarizer was removed from arm 1 and put into 
arm 4, the state of 1 was not defined anymore, but still could be teleported 
to photon 3; this was demonstrated by showing that now the entanglement 
had been swapped to photons 3 and 4. 

The state of photon 1 (Fig. 3.18), which is part of an entangled pair (pho- 
tons 1 and 4), is fully undetermined and is formally described by a mixed 
state. If one can teleport this state to another photon, i.e. to Bob’s photon 3, 
we expect to find this photon in a mixed state, which means that it is unpolarized.

Fig. 3.18. Experimental setup that demonstrates teleportation of arbitrary quantum
states: by teleporting the as yet undefined state of photon 1 to photon 3, one
is able to swap the entanglement, initially between particles 1 and 4 and between
particles 2 and 3, to the newly entangled pair of particles 3 and 4 by projecting 1
and 2 into an entangled pair

Now, since Bob’s photon was originally also part of an entangled pair (pho- 
tons 2 and 3), it was unpolarized anyway. One might conclude that here we 
did not achieve anything. However, if one determines not only the polariza- 
tion of photon 3 but the correlations between photons 3 and 4, one finds that 
now these two photons, which have been produced independently by different 
processes, are entangled [115]. 

Figure 3.19 verifies the entanglement between photons 3 and 4, conditioned
on coincidence detection of photons 1 and 2. Varying the angle $\Theta$ of
the polarizer in arm 4 causes a sinusoidal variation of the count rate, here
with the analyzer of photon 3 set to ±45°. This shows that we did not teleport
just a mixed state, but actually the as yet undetermined state of the
entangled photon.
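Entanglement swapping can be illustrated in the same idealized way. The sketch assumes that both pairs (1, 4) and (2, 3) start in |Ψ−⟩, that photons 1 and 2 are projected onto |Ψ−⟩, and that a linear polarizer at angle θ transmits cos θ |H⟩ + sin θ |V⟩. These conventions and the function names are assumptions made for illustration.

```python
import math

S = 1 / math.sqrt(2)
PSI_MINUS = [[0.0, S], [-S, 0.0]]          # (|HV> - |VH>)/sqrt(2), H -> 0, V -> 1

def swap_entanglement():
    """State of photons 3 and 4 after projecting photons 1 and 2 onto |Psi->.

    Returns (amplitude table indexed [photon 3][photon 4], success probability).
    """
    out = [[0.0, 0.0], [0.0, 0.0]]
    for i1 in range(2):
        for i2 in range(2):
            for i3 in range(2):
                for i4 in range(2):
                    joint = PSI_MINUS[i1][i4] * PSI_MINUS[i2][i3]
                    out[i3][i4] += PSI_MINUS[i1][i2] * joint
    prob = sum(x * x for row in out for x in row)
    scale = math.sqrt(prob)
    return [[x / scale for x in row] for row in out], prob

def coincidence(state, theta3_deg, theta4_deg):
    """Probability that photons 3 and 4 both pass polarizers at the given angles."""
    t3, t4 = math.radians(theta3_deg), math.radians(theta4_deg)
    a = (math.cos(t3), math.sin(t3))
    b = (math.cos(t4), math.sin(t4))
    amplitude = sum(a[i] * b[j] * state[i][j] for i in range(2) for j in range(2))
    return amplitude ** 2

state34, prob = swap_entanglement()
for theta in range(0, 181, 45):
    print(theta, round(coincidence(state34, 45, theta), 3))
```

With the analyzer of photon 3 fixed at 45°, the coincidence probability follows ½ sin²(45° − Θ), i.e. the sinusoidal dependence on the polarizer angle in arm 4 that a mere mixture would not show.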

These experiments present the first demonstration of quantum teleporta- 
tion, that is, the transfer of a qubit from one two-state particle to another. 
In the meantime, further steps have been achieved, in particular the remote 
preparation of the state of Bob’s photon (sometimes also called “telepor- 
tation”) [10] and, especially important, the teleportation of the state of an 
electromagnetic field [11]. The latter is the first example of teleportation of
continuous variables based on the original EPR entanglement. The first ex- 
periment demonstrated the feasibility of transfer of fluctuations of a coherent 
state from one light beam to another. Although the experiment was limited 
to a narrow bandwidth of 100 kHz, this was only a technical limitation due 
to the detection electronics, the modulators and the bandwidth of the source
of EPR-entangled light beams. In principle, it soon should be possible also to
transfer nonclassical states of light, such as squeezed light or number states.

Fig. 3.19. Verification of the entanglement between photons 3 and 4. The sinusoidal
dependence of the fourfold coincidence rate on the orientation $\Theta$ of the polarizer
in arm 4 for ±45° polarization analysis of photon 3 demonstrates the possibility to
teleport any arbitrary quantum state



3.6 Outlook 

Quantum communication with entangled photons has shown its power and 
its fascinating features. Our experiments, where realistic entanglement-based 
quantum cryptography has been performed, where the capacity of commu- 
nication channels has been increased beyond classical limits and where the 
polarization state of a photon has been transferred to another one by means 
of quantum teleportation, are only the first steps towards the exploitation of 
new resources for communication and information processing. 

Quantum communication can offer a wealth of further possibilities, es- 
pecially when combined with simple quantum logic circuitry. Quantum com- 
puters have to operate on large numbers of qubits to really demonstrate their 
power. But quantum communication schemes already profit from combining 
only a few qubits and entangled systems. Quantum logic operations with sev- 
eral particles are already useful in examples of the quantum coding theorem 
[69], but have shown their importance particularly in the proposals for entanglement
purification [119]. Any realistic transmission of quantum states will
suffer from noise and decoherence along the line. If one wants to distribute 
entangled pairs of particles to, say, Alice and Bob, the entanglement between 
the received particles will be considerably degraded, which would prevent 
successful quantum teleportation, for example. If Alice and Bob now combine
the particles of several such noisy pairs on each side by quantum logic
operations, they can improve the quality of entanglement by the proposed
“distillation” process. 

These ideas are closely related to quantum error correction for quantum 
computers and were recently implemented in a proposal for efficient distri- 
bution of entanglement via so-called quantum repeaters [162]. One day, such 
systems might form the core of quantum networks [163] allowing quantum 
communication and even computation over large distances. Of course, one 
should always keep in mind the obstacles put in the way by the decoherence 
of quantum states [164]. However, quantum communication schemes should 
be significantly more stable owing to the much lower number of quantum 
systems involved. 

Once entangled particles have been distributed, various quantum com- 
munication protocols could be implemented. Besides those described in the 
preceding sections, there are some recent proposals that give a new twist 
to quantum information processing. Quantum gambling [165] and quantum 
games [166], e.g. a “quantized” version of the prisoner’s dilemma, bring the
field of game theory to the quantum world and demonstrate new strategies 
in well-known classical games. But the new ideas and thoughts might also 
be quite useful for other types of communication problems. For example, the 
quantum version of “Chinese whispers” [167] can also be seen as a special
type of error correction scheme. Errors in the classical communication, the 
whispering, can be more efficiently corrected if the sender and receiver have 
been provided with entangled pairs of particles. 

New possibilities arise if entangled triples of particles are used. For cer- 
tain tasks, the communication between three or more parties becomes less 
complex, and thus more efficient, if the parties share the entanglement ini- 
tially [168], and also schemes for quantum cloning [169] of the state of a qubit 
become feasible with entangled triples. 

Now that significant improvements of down-conversion sources [145, 148, 
149] and the first observation of three-particle entanglement [38, 170] have 
been achieved, the realization with entangled triples of those schemes that 
have previously used entangled pairs is within the reach of future experiments. 
For realizing entanglement purification and similar schemes, the experiments 
immediately become much more complex. It first has to be seen what methods 
can be used to perform quantum logic operations with photons, and also what 
types of photon sources should be used then. However, the combined progress 
in the form of improving experimental techniques and of better understanding 
of the principles of quantum information theory makes the more complicated 
schemes feasible. Most likely, there will also be novel schemes for quantum 
communication using higher numbers of qubits and/or even more complex 
types of entanglement. 

Quantum cryptography was the first quantum communication method to 
literally leave the shielded environment of quantum physics laboratories and 
to become a promising candidate for commercial exploitation. We expect
that the future will show an enormous potential for and benefit from the
use of other quantum communication methods, such as the distribution of 
entanglement over large distances and the transfer of quantum information 
in the process of quantum teleportation.



4.1 Introduction

Classical computer science relies on the concept of Turing machines as a 
unifying model of universal computation. According to the modern Church- 
Turing Thesis, this concept is interpreted in the form that every physically 
reasonable model of computation can be efficiently simulated on a probabilistic
Turing machine. Recently this understanding, which was taken for
granted for a long time, has required a severe reorientation because of the 
emergence of new computers that do not rely on classical physics but, rather, 
use effects predicted by quantum mechanics. 

It has been realized that, by using the principles of quantum mechanics, 
there are problems for which a putative quantum computer could outperform 
any classical computer. Quantum algorithms benefit from the application of 
the superposition principle to the internal states of the quantum computer, 
which are considered to be states in a (finite-dimensional) Hilbert space. As 
a result, these algorithms lead to a new theory of computation and might be 
of central importance to physics and computer science. Striking examples of 
quantum algorithms are Shor’s factoring algorithm, Grover’s search algorithm 
and algorithms for quantum error-correcting codes, all of which will be part 
of this contribution. 

We shall introduce the complexity model of quantum gates, which are 
most familiar to researchers in the field of quantum computing, and shall give 
many examples of the usefulness and conciseness of this formalism. Quantum 
circuits provide a computational model equivalent to quantum Turing ma- 
chines. This means that, very much like the situation in classical computing, 
there are several ways of describing computations by appropriate theoretical 
models. 

Amongst such quantum circuits, quantum signal transforms form basic 
primitives in the treatment of controlled quantum systems. A surprising and 
important result, in view of the algorithms of Shor, is the fact that it is 
possible to compute a Fourier transform (of size $2^n$) on a quantum computer
by means of a quantum circuit which requires only $O(n^2)$ basic operations.
This is a substantial speedup compared to the classical case, where the fast
Fourier transform [171] yields an algorithm that requires $O(n 2^n)$ arithmetic
operations. Applications of such Fourier transforms to finite abelian groups
arise in the algorithms of Simon and Shor. We shall present these algorithms 
and the underlying principle leading to their surprisingly fast solution on a 
quantum computer. 
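The gate count quoted above can be made concrete with a small simulation. The sketch below applies the widely used textbook circuit for the quantum Fourier transform (one Hadamard gate per qubit followed by controlled phase rotations) to a state vector and compares the result with the Fourier matrix entries exp(2πi·xy/2^n)/√(2^n). This is an assumed standard construction, not necessarily the factorization derived later in this chapter, and the final reversal of the qubit order is treated as a relabelling and not counted.

```python
import cmath
import math

def qft_circuit(amplitudes, n):
    """Apply the textbook QFT circuit to 2**n amplitudes; return (result, gate count)."""
    state = [complex(a) for a in amplitudes]
    gates = 0
    for j in range(n):                        # qubit 0 is the most significant bit
        mask_j = 1 << (n - 1 - j)
        for x in range(len(state)):           # Hadamard gate on qubit j
            if not x & mask_j:
                a, b = state[x], state[x | mask_j]
                state[x] = (a + b) / math.sqrt(2)
                state[x | mask_j] = (a - b) / math.sqrt(2)
        gates += 1
        for k in range(j + 1, n):             # controlled phase between qubits k and j
            mask_k = 1 << (n - 1 - k)
            phase = cmath.exp(2j * math.pi / 2 ** (k - j + 1))
            for x in range(len(state)):
                if x & mask_j and x & mask_k:
                    state[x] *= phase
            gates += 1
    result = [0j] * len(state)
    for x in range(len(state)):               # reverse the order of the qubits
        y = int(format(x, f"0{n}b")[::-1], 2)
        result[y] = state[x]
    return result, gates

def dft_column(x, n):
    """Column x of the unitary Fourier matrix of size 2**n."""
    size = 2 ** n
    return [cmath.exp(2j * math.pi * x * y / size) / math.sqrt(size) for y in range(size)]

for n in (3, 10, 20):
    print(n, "qubits:", n * (n + 1) // 2, "quantum gates versus", n * 2 ** n, "for the FFT")
```

For n qubits this circuit uses n(n + 1)/2 gates, in line with the O(n²) bound, whereas the classical count n·2^n grows exponentially in n.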

As already mentioned, one of the basic results is that in the complexity 
model of quantum circuits, the Fourier transform can be realized with an 
exponential speedup compared with the classical case. However, in the quan- 
tum regime the only way to extract information from a system is to make 
measurements and thereby project out nearly all aspects of the whole system. 
Thus, the art and science of designing quantum algorithms lies in the ability 
to obtain enough information from measurements, i.e. to choose the right 
bases from which relevant information can be read off. On the basis of the 
example of the Fourier observable, which represents the most important case 
of such a base change, we explain the underlying principle by means of the 
so-called hidden-subgroup algorithms and present an analysis of sampling in 
the Fourier basis with respect to the appropriate groups. 

We then show how recent results in the theory of signal processing (for a 
classical computer) can be applied to obtain fast quantum algorithms for var- 
ious discrete signal transforms, including Fourier transforms for nonabelian 
groups. Finally, we give a brief introduction to the theory of (quantum) error- 
correcting codes and their algorithmic implementation. 



4.2 Architectures and Machine Models 

The definition of an architecture and a machine model, on which the compu- 
tations are considered to be carried out, is indispensable if one is to have a 
common computational model for which algorithms can be devised. 

Each reasonable model of computation should give us the possibility of 
performing arbitrary operations, up to a desired accuracy, on the system by 
execution of elementary operations. By counting the elementary operations 
necessary to complete a given task, we arrive at complexity models. Finally, 
if different approaches defining universal computational models are possible, 
it is desirable to show the equivalence of these models, in the sense that they 
can simulate each other with a slowdown that is polynomial in the size of the 
input. In the case of quantum computing, we give two models for universal 
quantum computation, namely quantum networks in Sect. 4.2.1 and quantum 
Turing machines in Sect. 4.2.6. We shall put more emphasis on gates and 
networks, relying on the result that these two models are equivalent in the 
sense described. 

One remark concerning the architecture is in order: we restrict ourselves 
to the case of operational spaces with a dimension that is a power of two, 
which are called qubit architectures. These systems incorporate the features 
necessary to do quantum computing, i.e. superposition of an exponentially
growing number of states, interference between computational paths and entanglement
between quantum registers.

Fig. 4.1. Elementary quantum gates

It is possible to perform an embedding of an arbitrary finite-dimensional 
operational space into a qubit architecture (see Sect. 4.2.3); however, this 
reduction involves a suitable encoding of the states of the system into the 
basis states of the qubit architecture, and hence genuine properties of the 
system might be lost by this procedure. 

4.2.1 Quantum Networks 

The state of a quantum computer is given by a normalized vector in a Hilbert
space $\mathcal{H}_{2^n}$ of dimension $2^n$, which is endowed with a natural tensor structure
$\mathcal{H}_{2^n} = \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2$ ($n$ factors). The standard basis for this Hilbert space is the
set $\{|x\rangle : x \in \mathbb{Z}_2^n\}$ of binary strings of length $n$. Restricting the computational
space to Hilbert spaces of this particular form is motivated by the idea of a
quantum register consisting of $n$ quantum bits. A quantum bit, also called a
qubit, is a state corresponding to one tensor component of $\mathcal{H}_{2^n}$ and has the
form

$$|\varphi\rangle = \alpha|0\rangle + \beta|1\rangle , \qquad \alpha, \beta \in \mathbb{C} , \quad |\alpha|^2 + |\beta|^2 = 1 .$$

The possible operations that this computer can perform are the elements
of the unitary group $\mathcal{U}(2^n)$. To study the complexity of performing unitary
operations on $n$-qubit quantum systems, we introduce the following two types
of computational primitives: local unitary operations on a qubit $i$ are matrices
of the form $U^{(i)} = \mathbf{1}_{2^{i-1}} \otimes U \otimes \mathbf{1}_{2^{n-i}}$, where $U$ is an element of the unitary
group $\mathcal{U}(2)$ of $2 \times 2$ matrices and $\mathbf{1}_N$ denotes the identity matrix of size
$N$. Furthermore, we need operations which affect two qubits at a time, the
most prominent of which is the so-called controlled NOT gate (also called a
measurement gate) between the qubits $j$ (control) and $i$ (target), denoted by
$\mathrm{CNOT}^{(i,j)}$. On the basis vectors $|x_n, \ldots, x_1\rangle$ of $\mathcal{H}_{2^n}$, the operation $\mathrm{CNOT}^{(i,j)}$
is defined by

$$|x_n, \ldots, x_{i+1}, x_i, x_{i-1}, \ldots, x_1\rangle \mapsto |x_n, \ldots, x_{i+1}, x_i \oplus x_j, x_{i-1}, \ldots, x_1\rangle ,$$

where the addition $\oplus$ is performed in $\mathbb{Z}_2$.

In graphical notation, using quantum wires, these transformations are
written as shown in Fig. 4.1. Lines correspond to qubits, unaffected qubits
are omitted and a dot • sitting on a wire denotes a control bit. Note that
we draw the qubits according to their significance, starting with the most
significant qubit on top. Quantum circuits are always read from left to right.
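The two kinds of elementary gates can be sketched as operations on a list of 2^n amplitudes. In this illustration qubits are numbered 1 to n from the leftmost tensor factor (the top wire), so that qubit i corresponds to bit position n − i of the basis index; this labelling, like the function names, is a convention chosen for the sketch and may differ from the indexing used in the text.

```python
import math

def apply_local(U, i, n, state):
    """Apply the 2x2 matrix U to qubit i of an n-qubit state (1 = leftmost factor)."""
    shift = n - i                               # bit position of qubit i in the index
    new = [0j] * len(state)
    for x, amplitude in enumerate(state):
        bit = (x >> shift) & 1
        for out_bit in (0, 1):
            y = (x & ~(1 << shift)) | (out_bit << shift)
            new[y] += U[out_bit][bit] * amplitude
    return new

def apply_cnot(control, target, n, state):
    """Flip the target qubit of every basis state whose control qubit is 1."""
    new = [0j] * len(state)
    for x, amplitude in enumerate(state):
        control_bit = (x >> (n - control)) & 1
        new[x ^ (control_bit << (n - target))] += amplitude
    return new

s = 1 / math.sqrt(2)
HADAMARD = [[s, s], [s, -s]]

zero_zero = [1, 0, 0, 0]                        # the basis state |00>
bell = apply_cnot(1, 2, 2, apply_local(HADAMARD, 1, 2, zero_zero))
print([round(abs(a), 3) for a in bell])
```

A Hadamard gate on the first qubit followed by a controlled NOT turns |00⟩ into the entangled state (|00⟩ + |11⟩)/√2, a two-gate circuit built only from the primitives introduced above.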

The two types of gates shown in Fig. 4.1 suffice to generate all unitary 
transformations, i.e. they form a universal set of gates. This is the content of 
the following theorem [172]. 

Theorem 4.1. $\mathcal{G}_1 := \{U^{(i)}, \mathrm{CNOT}^{(i,j)} \mid U \in \mathcal{U}(2),\ i, j \in \{1, \ldots, n\},\ i \neq j\}$
is a generating set for the unitary group $\mathcal{U}(2^n)$.

This means that for each $U \in \mathcal{U}(2^n)$ there is a word $w_1 w_2 \cdots w_k$ (where
$w_i \in \mathcal{G}_1$ for $i = 1, \ldots, k$ is an elementary gate) such that $U$ factorizes as
$U = w_1 w_2 \cdots w_k$. On the basis of Theorem 4.1, we now define a complexity
measure for unitary operations on qubit architectures.

Definition 4.1. Let $U \in \mathcal{U}(2^n)$ be a given unitary transformation. Then
$\kappa(U)$ is defined as the minimal number $k$ of operations in $\mathcal{G}_1$ necessary to
write $U = w_1 w_2 \cdots w_k$ as a sequence of elementary gates.

For the complexity measure $\kappa$, the following holds:
$\kappa(A \otimes B) \leq \kappa(A) + \kappa(B)$ for all $A \in \mathcal{U}(2^{n_1})$ and $B \in \mathcal{U}(2^{n_2})$, because tensor products are free
of cost in a computational model based on quantum mechanical principles.
Also, by concatenation of operations, we obtain $\kappa(A \cdot B) \leq \kappa(A) + \kappa(B)$ for
$A, B \in \mathcal{U}(2^n)$. Note that whereas in the usual linear complexity measure
[173, 174] permutation matrices are free (i.e. $L_c(\pi) = 0$ for all $\pi \in S_n$),
we have to take them into account when using the complexity measure $\kappa$.
Instead of the universal set of gates $\mathcal{G}_1$ we can, alternatively, use the set
$\mathcal{G}_2 := \{U^{(i,j)} : U \in \mathcal{U}(4),\ i, j \in \{1, \ldots, n\},\ i \neq j\}$ of all two-bit gates,
changing the value of $\kappa$ by only a constant factor.
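A small example shows why permutation matrices are not free under this measure. The sketch below represents controlled NOT gates on two qubits as permutations of the four basis indices (qubit 1 being the more significant bit, a convention chosen here) and verifies that exchanging the two qubits can be written as a word of three such gates. This gives an upper bound of 3 for the cost of the exchange in terms of elementary gates; the helper names are hypothetical.

```python
def cnot(control, target, n=2):
    """CNOT as a permutation p of basis indices, with p[x] the image of x."""
    return [x ^ (((x >> (n - control)) & 1) << (n - target)) for x in range(2 ** n)]

def compose(*perms):
    """Compose permutations like a matrix product: the rightmost acts first."""
    result = list(range(len(perms[0])))
    for p in reversed(perms):
        result = [p[x] for x in result]
    return result

word = (cnot(1, 2), cnot(2, 1), cnot(1, 2))    # a word of three elementary gates
exchange = compose(*word)

print(exchange)          # images of |00>, |01>, |10>, |11>
print(len(word), "elementary gates used")
```

The resulting permutation maps |01⟩ to |10⟩ and vice versa while fixing |00⟩ and |11⟩, so even this simple reordering of wires contributes to the gate count.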

Whereas the complexity measure $\kappa$ is used in cases where we want to
implement a given unitary operation $U$ exactly in terms of the generating
sets $\mathcal{G}_1$ and $\mathcal{G}_2$, it is also expedient to consider unitary approximations by
quantum networks. By this we mean a sequence of operations $w_1, \ldots, w_n$
which approximates $U$ up to a given $\epsilon$, i.e. such that $\|U - \prod_{i=1}^{n} w_i\| < \epsilon$, where
$\|\cdot\|$ denotes the spectral norm. We denote the corresponding complexity
measure by $\kappa_\epsilon$.

(Recall that the spectral norm of a matrix $A \in \mathbb{C}^{N \times N}$ is its largest singular value; for a normal matrix $A$ this equals $\max_{\lambda \in \mathrm{Spec}(A)} |\lambda|$.)

Remark 4.1. The following facts concerning approximation by elementary
gates are known:

• There are two-bit gates which are universal [175, 176], i.e. there exists a
unitary transformation $A \in \mathcal{U}(4)$ with respect to which it is possible to
approximate any given $U$ up to $\epsilon > 0$ by a sequence of applications of $A$
to two tensor components of $\mathcal{H}_{2^n}$ only: $\|U - \prod_{l=1}^{k} A^{(i_l, j_l)}\| < \epsilon$. Even
though the much stronger statement of universality of a generic two-bit
gate is known to be true, it is hard to prove the universality for a given
two-bit gate [176, 177].

• Small generating sets are known; for instance, we can choose 



where the first two operations generate a dense subgroup in $\mathcal{U}(2)$.
The Hadamard transformation, which is part of this generating set, is denoted
by

$$H_2 := \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

and is an example of a Fourier transformation on the abelian group $Z_2$
(see Sect. 4.4).

• Knill has obtained a general upper bound $O(n 4^n)$ for the approximation
of unitary matrices using a counting argument [178, 179].

• We cite the following approximation result from Sect. 4.2 of [180]: Fix
a number $n$ of qubits and suppose that
$\overline{\langle X_1, \ldots, X_r \rangle} = SU(2^n)$, i.e.
$X_1, \ldots, X_r$ generate a dense subgroup in the special unitary group
$SU(2^n)$. Then it is possible to approximate a given matrix
$U \in SU(2^n)$ with given accuracy $\varepsilon > 0$ by a product of length
$O\{\mathrm{poly}[\log(1/\varepsilon)]\}$, where the factors belong to the set
$\{X_1, \ldots, X_r, X_1^{-1}, \ldots, X_r^{-1}\}$. Furthermore, this
approximation is constructive and efficient, since there is an algorithm
with running time $O\{\mathrm{poly}[\log(1/\varepsilon)]\}$ which computes the
approximating product. However, we remind the reader that this holds only
for a fixed value of $n$; the constant hidden in the O-calculus grows
exponentially with $n$ (see Theorem 4.8 of [180]).

From now on, we put the main emphasis on the model for realizing unitary
transformations exactly and on the associated complexity measure $\kappa$.
In general, only exponential upper bounds for the minimal length occurring
in factorizations are known. However, there are many interesting classes of
unitary matrices in $\mathcal{U}(2^n)$ that lead to only a polylogarithmic
word length (polylogarithmic in the matrix size $2^n$), which means that the
length of a minimal factorization grows asymptotically like $O[p(n)]$, where
$p$ is a polynomial.

In the following we give some examples of transformations, their factor- 
ization into elementary gates and their graphical representation in terms of 
quantum gate arrays. The operations considered in these examples admit 
short factorizations and will be useful in the subsequent parts of this chap- 
ter. 








Example 4.1 (Permutation of Qubits). The symmetric group $S_n$ is embedded
in $\mathcal{U}(2^n)$ by the natural operation of $S_n$ on the tensor
components (qubits). Let $\tau \in S_n$ and let $\Pi_\tau$ be the
corresponding permutation matrix on $2^n$ points. Then
$\kappa(\Pi_\tau) = O(n)$: to prove this, we first note that each element
$\sigma \in S_n$ can be written as a product $\sigma = \tau_1 \cdot \tau_2$ of
two involutions $\tau_1$ and $\tau_2 \in S_n$, i.e.
$\tau_1^2 = \tau_2^2 = \mathrm{id}$. To see that it is always possible to find
a suitable $\tau_1$ and $\tau_2$, we can, when considering the decomposition
of $\sigma$ into disjoint cycles, restrict ourselves to the case of an
$n$-cycle. Now the decomposition follows immediately from the fact that there
is a dihedral group of size $2n$ containing $\sigma$ as the canonical
$n$-cycle and that this rotation is the product of two reflections.

The unitary transformation $\Pi_\tau$, where $\tau \in S_n$ is an
involution, can be realized by swappings of quantum wires, which, in turn,
can be performed efficiently and in parallel. To swap two quantum wires we
can use the well-known identity
$\Pi_{(1,2)} = \mathrm{CNOT}^{(1,2)} \cdot \mathrm{CNOT}^{(2,1)} \cdot \mathrm{CNOT}^{(1,2)}$,
yielding a circuit of depth three. Writing an arbitrary permutation
$\Pi_\sigma$ of the qubits as a product of two involutions, we therefore
obtain a realization by a circuit of depth six at most (see also [181]).

As an example, the permutation (1, 3, 2) of the qubits (which corresponds 
to the permutation (1, 4, 2)(3, 5, 6) on the register) is factored as (1,3,2) = 
(1,2)(2,3) (see Fig. 4.2). 








Fig. 4.2. Factorization (1,3,2) = (1,2)(2,3)
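
The swap identity of Example 4.1 can be checked on classical bit strings, since CNOT gates permute the basis states. The following is a minimal sketch under our own assumptions: wires are indexed from 0, the helper names are hypothetical, and which of the two 3-cycles the composed swaps realize depends on the chosen convention for composing permutations.

```python
from itertools import product

def cnot(bits, control, target):
    """Return a copy of `bits` with bits[target] ^= bits[control]."""
    out = list(bits)
    out[target] ^= out[control]
    return tuple(out)

def swap_via_cnots(bits, a, b):
    """Swap wires a and b using three CNOT gates (a circuit of depth three)."""
    bits = cnot(bits, a, b)
    bits = cnot(bits, b, a)
    bits = cnot(bits, a, b)
    return bits

def three_cycle_of_wires(bits):
    """Compose two wire swaps (0-based wires 1,2 then 0,1) into a 3-cycle of the wires."""
    bits = swap_via_cnots(bits, 1, 2)
    bits = swap_via_cnots(bits, 0, 1)
    return bits

if __name__ == "__main__":
    for bits in product((0, 1), repeat=3):
        print(bits, "->", three_cycle_of_wires(bits))
```

Running the loop lists the induced permutation of the eight register states; the states 000 and 111 are fixed, and the remaining six split into two cycles of length three.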



Example 4.2 (Controlled Operations). Following [172], we introduce a special
class of quantum gates with multiple control qubits, yielding a natural
generalization of the controlled NOT gate. This class of gates is given by
the transformations $\Lambda_k(U)$, where $U$ is a unitary transformation in
$\mathcal{U}(2^l)$. The gate $\Lambda_k(U)$ is a transformation acting on
$k + l$ qubits, where the $k$ most significant bits serve as control bits and
the $l$ least significant bits are target bits: the operation $U$ is applied
to the $l$ target bits if and only if all $k$ control bits are equal to 1.
Denoting by $M := 2^l (2^k - 1)$ the number of basis vectors on which
$\Lambda_k(U)$ acts trivially, the corresponding unitary matrix is given by
$\mathbf{1}_M \oplus U$, where we have used $\oplus$ to denote a direct sum of
matrices.

To provide further examples of the graphical notation for quantum circuits,
we give in Fig. 4.3 a $\Lambda_1(U)$ gate for $U \in \mathcal{U}(2^n)$ with a
normal control qubit, a gate $\Lambda_1(U)$ with an inverted control qubit,
and the matrices represented. Lemmas 7.2 and 7.5 of [172] show that for
$U \in \mathcal{U}(2)$, the gate $\Lambda_k(U)$ can be realized with gate
complexity $O(n)$, for $k < n - 1$. If there are auxiliary qubits (so-called
ancillae) available, a gate $\Lambda_{n-1}(U)$ can also be computed using
$O(n)$ operations from $\mathcal{G}_1$; otherwise, we have
$\kappa[\Lambda_{n-1}(U)] = O(n^2)$.

Fig. 4.3. Controlled gates with (left) normal and (right) inverted control
bit. Here $\oplus$ is used to denote the block-direct sum of matrices



We remark that, if $U \in \mathcal{U}(2^n)$ can be realized in $p$ elementary
operations, then $\Lambda_1(U) \in \mathcal{U}(2^{n+1})$ can be realized in
$c \cdot p$ basic operations, where $c \in \mathbb{N}$ is a constant that does
not depend on $U$. To see this, we first assume, without loss of generality,
that $U$ is decomposed into elementary gates. Therefore, we have to show that
a doubly controlled NOT (also called a Toffoli gate, see Sect. 4.2.3) and a
singly controlled $U \in \mathcal{U}(2)$ gate can be realized with a constant
increase of length. It is possible to obtain the bound $c \le 17$ according
to the following decompositions [172]: for each unitary transformation
$U \in \mathcal{U}(2)$ we can write
$\Lambda_1(U) = A^{(1)} \cdot \mathrm{CNOT}^{(1,2)} \cdot B^{(1)} \cdot \mathrm{CNOT}^{(1,2)} \cdot C^{(1)}$
with suitably chosen $A, B, C \in \mathcal{U}(2)$, i.e. we need at most five
elementary gates for the realization of $\Lambda_1(U)$. To bound the number
of operations necessary to realize $\tau := \Lambda_2(\sigma_x)$ with respect
to the set $\mathcal{G}_1$, we choose a square root $R$ of $\sigma_x$, i.e.
$R^2 = \sigma_x$, and use the identity

$$\tau = [\mathbf{1}_2 \otimes \Lambda_1(R)] \cdot \mathrm{CNOT}^{(2,3)} \cdot [\mathbf{1}_2 \otimes \Lambda_1(R^\dagger)] \cdot \mathrm{CNOT}^{(2,3)} \cdot \Lambda_1(R) ,$$

where the last factor $\Lambda_1(R)$ is controlled by the remaining control
qubit (the one not used by the bracketed factors) and acts on the target
qubit. This shows that we need at most 5 + 1 + 5 + 1 + 5 = 17 elementary
gates to realize $\tau$.
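
The five-factor construction of the Toffoli gate can be verified numerically. The sketch below is an illustration under our own assumptions, not the authors' notation: wires 0 and 1 are the controls, wire 2 is the target, states are stored as dictionaries from bit tuples to amplitudes, and the particular square root `R` of $\sigma_x$ is one possible choice.

```python
import itertools

# One possible square root of sigma_x: R * R = [[0, 1], [1, 0]].
R = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]
R_DAG = [[R[j][i].conjugate() for j in range(2)] for i in range(2)]
X = [[0, 1], [1, 0]]

def controlled_u(state, u, control, target):
    """Apply the 2x2 matrix u to `target` on basis states whose `control` bit is 1."""
    new = {}
    for bits, amp in state.items():
        if bits[control] == 0:
            new[bits] = new.get(bits, 0) + amp
            continue
        for out_bit in (0, 1):
            out = list(bits)
            out[target] = out_bit
            key = tuple(out)
            new[key] = new.get(key, 0) + u[out_bit][bits[target]] * amp
    return new

def toffoli_from_five_gates(state):
    """Doubly controlled NOT built from controlled-R, controlled-R^dagger and CNOT."""
    state = controlled_u(state, R, 1, 2)      # Lambda_1(R), controlled by wire 1
    state = controlled_u(state, X, 0, 1)      # CNOT between the two control wires
    state = controlled_u(state, R_DAG, 1, 2)  # Lambda_1(R^dagger), controlled by wire 1
    state = controlled_u(state, X, 0, 1)      # CNOT again, restoring wire 1
    state = controlled_u(state, R, 0, 2)      # Lambda_1(R), controlled by wire 0
    return state

def acts_as_toffoli(tol=1e-12):
    """Check the action on all eight basis states."""
    for bits in itertools.product((0, 1), repeat=3):
        expected = (bits[0], bits[1], bits[2] ^ (bits[0] & bits[1]))
        out = toffoli_from_five_gates({bits: 1})
        for key, amp in out.items():
            if abs(amp - (1 if key == expected else 0)) > tol:
                return False
    return True
```

The check works because the target receives $R \cdot R = \sigma_x$ exactly when both controls are 1, and $R^\dagger R$ or $R R^\dagger$ (the identity) when only one of them is.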

Example 4.3 (Cyclic Shift). Let $P_n \in S_{2^n}$ be the cyclic shift acting
on the states of the quantum register as $x \mapsto x + 1 \bmod 2^n$. The
corresponding permutation matrix is the $2^n$-cycle $(0, 1, \ldots, 2^n - 1)$.
The unitary matrix $P_n$ can be realized in a polylogarithmic number of
operations; see Fig. 4.4 for a realization using Boolean gates only.

Fig. 4.4. Realizing a cyclic shift on a quantum register

Other, non-Boolean factorizations are also possible: using a basic fact
about group circulants (see also Sect. 4.5.1) and anticipating the fact that
the discrete Fourier transform $\mathrm{DFT}_{2^n}$ can be performed in
$O(n^2)$ operations (which is shown in Sect. 4.4.1), we can use the identity

$$\mathrm{DFT}_{2^n}^{-1} \cdot P_n \cdot \mathrm{DFT}_{2^n} = \mathrm{diag}(\omega_{2^n}^{i} : i = 0, \ldots, 2^n - 1) = \mathrm{diag}(1, \omega_{2^n}^{2^{n-1}}) \otimes \cdots \otimes \mathrm{diag}(1, \omega_{2^n})$$

to obtain $\kappa(P_n) = O(n^2)$.
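
The second equality in this identity, i.e. the splitting of the diagonal matrix of all powers of $\omega_{2^n}$ into a tensor product of $n$ single-qubit phase gates, can be checked directly. The sketch below assumes $\omega_{2^n} = e^{2\pi i / 2^n}$ and the most significant qubit as the leftmost tensor factor; the function names are ours.

```python
import cmath

def phase_diagonal(n):
    """Diagonal entries omega^i, i = 0..2^n - 1, with omega = exp(2*pi*i / 2^n)."""
    size = 2 ** n
    return [cmath.exp(2j * cmath.pi * i / size) for i in range(size)]

def tensor_of_single_qubit_phases(n):
    """Diagonal of diag(1, omega^(2^(n-1))) (x) ... (x) diag(1, omega^(2^0))."""
    size = 2 ** n
    diag = [1 + 0j]
    for k in reversed(range(n)):  # most significant qubit first
        factor = [1 + 0j, cmath.exp(2j * cmath.pi * (2 ** k) / size)]
        # Kronecker product of two diagonal matrices, kept as lists of diagonals.
        diag = [d * f for d in diag for f in factor]
    return diag

def diagonals_agree(n, tol=1e-9):
    """Compare both diagonals entry by entry."""
    return all(abs(a - b) < tol
               for a, b in zip(phase_diagonal(n), tensor_of_single_qubit_phases(n)))
```

The agreement reflects the binary expansion $i = \sum_k b_k 2^k$, which gives $\omega^i = \prod_k \omega^{b_k 2^k}$, so the diagonal part costs only $n$ single-qubit gates.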

4.2.2 Boolean Functions and the Ring Normal Form 

Boolean functions are important primitives used throughout classical
informatics. Denoting the finite field with $q$ elements^ by $GF(q)$, we
obtain the Boolean numbers as the special case $q = 2$. A multivariate
Boolean function $f : GF(2)^n \to GF(2)$ can be represented in various ways.
Besides the truth table, which is a common but uneconomic way to represent
$f$ as the sequence of its values $f(0 \ldots 0), \ldots, f(1 \ldots 1)$ for
all binary strings of length $n$, prominent examples of normal forms are the
conjunctive and disjunctive normal forms [183] which originate from predicate
logic and are used in transistor circuitry.

For quantum computational purposes, another way of representing $f$ offers
itself, namely the ring normal form (RNF), defined as the (unique) expansion
of $f$ as a polynomial in the ring of Boolean functions of $n$ variables.
This ring is defined by
$R_n := GF(2)[X_1, \ldots, X_n] / (X_1^2 - X_1, \ldots, X_n^2 - X_n)$ [184].

Multiplication and addition in $R_n$ are the usual multiplication and
addition of polynomials modulo the relations given by the ideal
$(X_1^2 - X_1, \ldots, X_n^2 - X_n)$, and addition is usually denoted as
“$\oplus$”. Therefore $f$ is represented as

$$f(X_1, \ldots, X_n) := \bigoplus_{u = (u_1, \ldots, u_n) \in \{0,1\}^n} a_u \prod_{i=1}^{n} X_i^{u_i} \qquad (4.1)$$

with coefficients $a_u \in GF(2)$.

Example 4.4. The logical complement is given by $\mathrm{NOT}(X) = 1 \oplus X$.
The RNF of the AND function on $n$ variables $X_1, \ldots, X_n$ is
$\mathrm{AND}(X_1, \ldots, X_n) = \prod_{i=1}^{n} X_i$. The RNF of the PARITY
function of $n$ variables is given by
$\mathrm{PARITY}(X_1, \ldots, X_n) = \bigoplus_{i=1}^{n} X_i$. Finally, the RNF
of the OR function on $n$ variables is given by

$$\mathrm{OR}(X_1, \ldots, X_n) = 1 \oplus \prod_{i=1}^{n} (1 \oplus X_i) = \bigoplus_{m} m(X_1, \ldots, X_n) ,$$

where the last sum runs over all nonconstant multilinear monomials $m$ in the
ring $R_n$.

^ Necessarily, we have $q = p^n$, where $p$ is a prime and $n \ge 1$ [182].
Finite fields are also called Galois fields after Évariste Galois
(1811-1832).
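
The RNF coefficients of a Boolean function can be computed from its truth table by a butterfly-style transform over $GF(2)$ (often called the Möbius transform). The following sketch is our own illustration; the function name and the representation of monomials as 0/1 exponent tuples are assumptions, not notation from the text.

```python
from itertools import product

def ring_normal_form(f, n):
    """Return the monomials (as 0/1 exponent tuples u) that occur in the RNF of f."""
    # Start from the truth table and transform it in place into RNF coefficients.
    coeffs = {x: f(*x) & 1 for x in product((0, 1), repeat=n)}
    for i in range(n):
        for x in product((0, 1), repeat=n):
            if x[i] == 1:
                below = x[:i] + (0,) + x[i + 1:]
                coeffs[x] ^= coeffs[below]
    return {u for u, c in coeffs.items() if c}

if __name__ == "__main__":
    print(sorted(ring_normal_form(lambda a, b, c: a & b & c, 3)))
    print(sorted(ring_normal_form(lambda a, b, c: a ^ b ^ c, 3)))
    print(sorted(ring_normal_form(lambda a, b, c: a | b | c, 3)))
```

For three variables, AND yields the single monomial $X_1 X_2 X_3$, PARITY yields the three linear monomials, and OR yields all seven nonconstant multilinear monomials, in agreement with Example 4.4.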

We can implement a Boolean function given in the RNF shown in (4.1)
with the use of the so-called Toffoli gate $\tau$. The action of $\tau$ on the
basis states of the Hilbert space $\mathcal{H}_8$ is given by
$\tau : |x\rangle|y\rangle|z\rangle \mapsto |x\rangle|y\rangle|z \oplus x \cdot y\rangle$
[185].

Horner's rule for multivariate polynomials yields a method for implementing
the function $f$ given in (4.1). To achieve this, we write
$f(X_1, \ldots, X_n) = f_1(X_1, \ldots, X_{n-1}) \cdot X_n \oplus f_2(X_1, \ldots, X_{n-1})$
and observe that this function can be computed using one Toffoli gate,
assuming that $f_1$ and $f_2$ have already been computed. Therefore, we
obtain a recursive factorization for $f$, which in general will make use of
auxiliary qubits.

4.2.3 Embedded Transforms 

This section deals with the issue of embedding a given transform $A$ into a
unitary matrix of larger size. We start by considering the problem of
realizing a given matrix $A$ as a submatrix of a unitary matrix of larger
size. The following theorem (see also [186]) shows that the only condition
$A$ has to fulfill in order to allow an embedding involving one additional
qubit is to be of bounded norm, i.e. $\|A\| \le 1$, with respect to the
spectral norm.

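Theorem 4.2. Let $A \in \mathbb{C}^{n \times n}$ be a given matrix of norm
$\|A\| \le 1$. Then

$$U_A := \begin{pmatrix} A & (\mathbf{1}_n - A A^\dagger)^{1/2} \\ (\mathbf{1}_n - A^\dagger A)^{1/2} & -A^\dagger \end{pmatrix} \qquad (4.2)$$

yields a unitary matrix $U_A \in \mathcal{U}(2n)$ which contains $A$ as the
$n \times n$ submatrix in the upper left corner.

Proof. Observe that the $n \times 2n$ matrix
$U_1 := (A, (\mathbf{1}_n - A A^\dagger)^{1/2})$ has the property
$U_1 \cdot U_1^\dagger = \mathbf{1}_n$. Analogously, for the matrix
$U_2 := [A^\dagger, (\mathbf{1}_n - A^\dagger A)^{1/2}]$, the identity
$U_2 \cdot U_2^\dagger = \mathbf{1}_n$ holds. An easy computation shows that
(4.2) is indeed unitary. □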

Since each matrix in $\mathbb{C}^{n \times n}$ can be renormalized by
multiplication with a suitable scalar to fulfill the requirement of a bounded
norm, we can realize all operations up to a scalar prefactor by unitary
embeddings. The embedding (4.2) is by no means unique. However, it is
possible to parametrize all embeddings by
$(\mathbf{1}_n \oplus V_1) \cdot U_A \cdot (\mathbf{1}_n \oplus V_2)$, where
$V_1, V_2 \in \mathcal{U}(n)$ are arbitrary unitary transforms.
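
For $n = 1$ the block form (4.2) reduces to a $2 \times 2$ matrix that is easy to check by hand or by machine. The sketch below treats only this scalar case, with $A = (a)$ and $|a| \le 1$; the helper names are ours, and a general implementation would need a matrix square root, which is not attempted here.

```python
import math

def embed_scalar(a):
    """Unitary 2x2 embedding of the 1x1 matrix (a), |a| <= 1, following the block form (4.2)."""
    s = math.sqrt(max(0.0, 1.0 - abs(a) ** 2))  # (1 - a a^*)^(1/2) for n = 1
    return [[a, s],
            [s, -a.conjugate()]]

def is_unitary(u, tol=1e-12):
    """Check u * u^dagger = identity for a 2x2 matrix given as nested lists."""
    for i in range(2):
        for j in range(2):
            entry = sum(u[i][k] * u[j][k].conjugate() for k in range(2))
            if abs(entry - (1 if i == j else 0)) > tol:
                return False
    return True

if __name__ == "__main__":
    print(embed_scalar(0.6))
    print(is_unitary(embed_scalar(0.3 + 0.4j)))
```

For $a = 3/5$ the embedding is $\frac{1}{5}\begin{pmatrix} 3 & 4 \\ 4 & -3 \end{pmatrix}$, whose upper left entry is the prescribed value.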

We are naturally led to a different kind of embedding if the given
transformation is unitary and we want to realize it on a qubit architecture,
i.e. if we restrict ourselves to matrices whose size is a power of 2. Then, a
given unitary matrix $U \in \mathcal{U}(N)$ can be embedded into a unitary
matrix in $\mathcal{U}(2^n)$ by choosing $n = \lceil \log_2 N \rceil$ and
padding $U$ with an identity matrix $\mathbf{1}_{2^n - N}$ of size
$2^n - N$. This is noncanonical since we have degrees of freedom in the
choice of the subspace on which this newly formed matrix acts as the
identity.

In general, it is a difficult problem to find the optimal embedding for a
given transform. An example in which it is not natural to go to the next
power of 2 is given by $U = U_1 \otimes U_2 \in \mathcal{U}(15)$, where
$U_1 \in \mathcal{U}(3)$ and $U_2 \in \mathcal{U}(5)$. We then have the
possibilities $U \oplus \mathbf{1}_1 \in \mathcal{U}(2^4)$ and the embedding
$(U_1 \oplus \mathbf{1}_1) \otimes (U_2 \oplus \mathbf{1}_3) \in \mathcal{U}(2^5)$,
which respects the tensor decomposition of $U$.

A third type of embedding occurs in the context of quantum and reversible
computing, where a general method is required to make a given map
$f : X \to Y$ bijective (here $X$ and $Y$ are finite sets). If we consider
the map $\tilde{f} : X \times Y \to X \times Y$ which maps
$(x, y_0) \mapsto (x, f(x))$, where $y_0$ is a fixed element in the codomain
$Y$ of $f$, then this map is obviously injective when restricted to the fibre
$X \times \{y_0\}$. Observe now that it is always possible to extend
$\tilde{f}|_{X \times \{y_0\}}$ to a unitary operation on the Hilbert space
$\mathcal{H}_{X,Y}$ spanned by the basis consisting of
$\{|x\rangle|y\rangle : x \in X, y \in Y\}$. Hence, it is always possible to
construct a unitary operation
$V_f : \mathcal{H}_{X,Y} \to \mathcal{H}_{X,Y}$ which has the property

$$|x\rangle|0\rangle \mapsto |x\rangle|f(x)\rangle , \quad \text{for all } x \in X , \qquad (4.3)$$

i.e. $V_f$ implements the graph $\Gamma_f = \{(x, f(x)) : x \in X\}$ of $f$ in
the Hilbert space $\mathcal{H}_{X,Y}$. Here, we have identified the special
element $y_0$ with the basis vector $|0\rangle \in \mathcal{H}_Y$.

Example 4.5. Let $f : GF(2)^2 \to GF(2)$ be the AND function, i.e. let
$f = x \cdot y$ be the RNF of $f$. Note that $\tilde{f}$ can be chosen to be
the function $(x, y, z) \mapsto (x, y, z \oplus f(x, y))$ since the codomain
is endowed with a group structure. Overall, we obtain the function table of
$\tilde{f}$ given in Fig. 4.5. The variables with a prime correspond to the
values after the transformation has been performed.

Fig. 4.5. Truth table for the Toffoli gate and the corresponding quantum
circuit

We recognize $\tilde{f}$ as the unitary operation
$\tau : |x\rangle|y\rangle|z\rangle \mapsto |x\rangle|y\rangle|z \oplus x \cdot y\rangle$
on the Hilbert space $\mathcal{H}_8$, which is the Toffoli gate [185].

The method described in Example 4.5 is quite general, as the following
theorem shows (for a proof see [187]).

Theorem 4.3. Suppose $f : \{0,1\}^n \to \{0,1\}^m$ is a Boolean function which
can be computed using $c$ operations from the universal set {AND, NOT}
of classical gates. Then $\tilde{f} : \{0,1\}^{n+m} \to \{0,1\}^{n+m}$, defined
by $(x, y) \mapsto (x, y \oplus f(x))$, is a reversible Boolean function which
can be computed by a circuit of length $2c + m$ built up from the set
$\{\mathrm{CNOT}, \tau\}$ of reversible gates.
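
That the map $(x, y) \mapsto (x, y \oplus f(x))$ is a bijection for every $f$, reversible or not, can be confirmed by brute force on small examples. The following is a minimal sketch with hypothetical helper names; it checks only the bijectivity claim, not the gate count $2c + m$.

```python
from itertools import product

def reversible_embedding(f):
    """Return the map (x, y) -> (x, y XOR f(x)) on pairs of bit tuples."""
    def f_tilde(x, y):
        fx = f(x)
        return x, tuple(yi ^ fi for yi, fi in zip(y, fx))
    return f_tilde

def is_bijection(f_tilde, n, m):
    """Check that f_tilde permutes {0,1}^n x {0,1}^m."""
    images = {f_tilde(x, y)
              for x in product((0, 1), repeat=n)
              for y in product((0, 1), repeat=m)}
    return len(images) == 2 ** (n + m)

if __name__ == "__main__":
    and_gate = lambda x: (x[0] & x[1],)       # f = AND, n = 2, m = 1
    toffoli = reversible_embedding(and_gate)
    print(toffoli((1, 1), (0,)))
    print(is_bijection(toffoli, 2, 1))
```

With $f$ = AND the embedding is exactly the Toffoli gate of Example 4.5, and applying the map twice returns the input because $y \oplus f(x) \oplus f(x) = y$.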

Even though the construction described in Theorem 4.3 works for arbitrary
$f : \{0,1\}^n \to \{0,1\}^m$, in general only
$r_f := \lceil \log_2 \max_{y \in \{0,1\}^m} |f^{-1}(y)| \rceil$ additional bits
are necessary to define a reversible Boolean function
$f_{\mathrm{rev}} : \{0,1\}^{n + r_f} \to \{0,1\}^{n + r_f}$ with the property
$f_{\mathrm{rev}}|_{\{0,1\}^n} = f$. The reason is that by using the additional
$r_f$ bits, the preimages of $f$ can be separated via a suitable binary
encoding. However, the complexity of a Boolean circuit realizing
$f_{\mathrm{rev}}$ constructed in such a way cannot be controlled as easily as
for the function defined in Theorem 4.3.

4.2.4 Permutations 

We have already mentioned in Sect. 4.2.1 that on a quantum computer
permutations of the basis states have to be taken into account when
considering the complexity: in general, for the cost $\kappa(\pi)$, where
$\pi$ is a permutation matrix in $\mathcal{U}(2^n)$ and $\kappa$ is the
complexity measure introduced in Sect. 4.2.1, nothing better is known than an
exponential upper bound of $O(n 4^n)$.

Nevertheless, there are quite a few classes of permutations admitting a
better, even polylogarithmic word length, as the examples of permutations of
quantum wires and of the cyclic shift $P_n : x \mapsto x + 1 \bmod 2^n$ on a
quantum register have shown (see Examples 4.1 and 4.3).

In what follows we consider a further class of permutations of a quantum
register that admit efficient realizations, namely those which operate by
linear transformations on the names of the kets. Recall that the basis states
can be identified with the binary words of length $n$ and hence, with the
elements of $GF(2)^n$, the $n$-dimensional vector space over the finite field
$GF(2)$ of two elements. Denoting the group of invertible linear
transformations of $GF(2)^n$ by $GL(n, GF(2))$, we see that each transform
$A \in GL(n, GF(2))$ corresponds to a permutation of the binary words of
length $n$ and, hence, to a permutation matrix $\Phi_A$ of size
$2^n \times 2^n$.

It turns out that these permutations are efficiently realizable on a quan- 
tum computer (see Sect. 4 of [188]). First we need the following lemma. 

Lemma 4.1. Let $K$ be a field and let $A \in GL(n, K)$ be an invertible matrix
with entries in $K$. Then there exist a permutation matrix $P$, a lower
triangular matrix $L$ and an upper triangular matrix $U$ such that
$A = P \cdot L \cdot U$.

In numerical mathematics this decomposition is also known as the “LU
decomposition” (see, e.g., Sect. 3.2 of [189]). The statement is a consequence
of Gauss's algorithm. We are now ready to prove the following theorem.

Theorem 4.4. Given $A \in GL(n, GF(2))$, $\Phi_A$ can be realized in $O(n^2)$
elementary operations on a quantum computer.

Proof. First, decompose $A$ according to Lemma 4.1 into
$A = P \cdot L \cdot U$ and observe that the permutation matrix $P$ is a
permutation of the quantum wires, and hence $\kappa(\Phi_P) = O(n)$ (see
Example 4.1). The matrices $L$ and $U$ can be realized using CNOT gates only.
Without loss of generality, we consider the factorization of $L$; proceeding
along the diagonals of these matrices, we find all diagonal entries to be 1
(otherwise the matrices would not be invertible). Therefore $\Phi_L$ maps the
basis vector $|e_i\rangle$, where $e_i = (0 \ldots 1 \ldots 0)$ is the $i$th
basis vector in the standard basis of $GF(2)^n$, to the sum
$|\sum_{j \ge i} \alpha_j e_j\rangle$, where the vector
$(\alpha_j)_{j = 1, \ldots, n}$ is the $i$th column of $L$. Application of the
sequence $\prod_{j > i} \mathrm{CNOT}^{(i,j)}$, where the product runs over all
$j$ with $\alpha_j \ne 0$, has the same effect on the basis vector $e_i$.

Proceeding column by column in $L$ yields a factorization into $O(n^2)$
elementary gates. Combining the factorizations for $P$, $L$ and $U$, we obtain
$\kappa(\Phi_A) = O(n^2)$. □

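As an example, we take a look at the matrix

$$A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \in GL(2, GF(2)) .$$

To see what the corresponding $\Phi_A$ looks like, we compute the effect of
$A$ on the basis vectors:

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix} \mapsto \begin{pmatrix} 0 \\ 0 \end{pmatrix} , \quad \begin{pmatrix} 1 \\ 0 \end{pmatrix} \mapsto \begin{pmatrix} 1 \\ 1 \end{pmatrix} , \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \quad \begin{pmatrix} 1 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 1 \\ 0 \end{pmatrix} ,$$

i.e. $\Phi_A = \mathrm{CNOT}^{(1,2)}$ in accordance with Theorem 4.4, since
$A$ is already lower triangular.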
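
The column-by-column argument in the proof of Theorem 4.4 can be turned into a short routine for the lower triangular factor. The sketch below rests on our own assumptions: wires are indexed from 0, a gate is a pair (control, target), and the columns are processed from right to left so that every control wire still carries its original value when it is used.

```python
def cnot_sequence_lower_triangular(L):
    """List of (control, target) pairs realizing x -> L x over GF(2).

    L must be lower triangular with ones on the diagonal.
    """
    n = len(L)
    gates = []
    for i in reversed(range(n)):      # column i provides the control wire
        for j in range(i + 1, n):
            if L[j][i]:
                gates.append((i, j))
    return gates

def apply_gates(gates, x):
    """Apply the CNOT gates in order to the bit tuple x."""
    bits = list(x)
    for control, target in gates:
        bits[target] ^= bits[control]
    return tuple(bits)

def matvec_gf2(A, x):
    """Matrix-vector product over GF(2)."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in A)

if __name__ == "__main__":
    A = [[1, 0], [1, 1]]
    print(cnot_sequence_lower_triangular(A))
```

For the $2 \times 2$ example above the routine returns a single gate with the first wire as control and the second as target; for an $n \times n$ matrix it emits at most $n(n-1)/2$ gates.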

As further examples of permutations arising as unitary transforms on a
quantum computer, we mention gates for modular arithmetic [190, 191]. More
specifically, we consider the following operation, acting on kets which have
been endowed with the group structure of $Z_N := Z/NZ$, i.e.
$\{|x\rangle : x \in Z_N\}$ is a basis of this operational space
$\mathcal{H}$. Then

$$T_a : |x\rangle|0\rangle \mapsto |x\rangle|a \cdot x \bmod N\rangle ,$$

where $a \in Z_N^\times$ is an element of the multiplicative group of units in
$Z_N$; this can be extended to a permutation of the whole space
$\mathcal{H} \otimes \mathcal{H}$ using the methods of Sect. 4.2.3. Using a
number of ancilla qubits which is polynomial in $\log[\dim(\mathcal{H})]$, it
is possible to realize $T_a$, as well as other basic primitives known from
classical circuit design [183], such as

• adders modulo $N$: $|x\rangle|y\rangle \mapsto |x\rangle|x + y \bmod N\rangle$

• modular exponentiation: $|x\rangle|0\rangle \mapsto |x\rangle|a^x \bmod N\rangle$,

in polylogarithmic time on a quantum computer [190, 191].

4.2.5 Preparing Quantum States

If we are interested in preparing particular quantum states by means of an
effective procedure, in most cases it is straightforward to write down a
quantum circuit which yields the desired state when applied to the ground
state $|0\rangle$. For instance, by means of the quantum circuit given in
Fig. 4.6, a Schrödinger cat state $|\Psi_n\rangle$ on $n$ qubits can be
prepared using $n+1$ elementary gates. These are the states

$$|\Psi_n\rangle := \frac{1}{\sqrt{2}} \bigl( |\underbrace{0 \ldots 0}_{n \text{ zeros}}\rangle + |\underbrace{1 \ldots 1}_{n \text{ ones}}\rangle \bigr)$$

and we remind the reader that $|\Psi_2\rangle$ is locally equivalent to a
so-called EPR state [24] and $|\Psi_3\rangle$ is a so-called GHZ state [37].



Fig. 4.6. Quantum circuit that prepares a cat state
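
A cat state can be simulated with a few lines of code. The sketch below assumes one common circuit, a Hadamard gate on the first wire followed by a chain of CNOT gates; the exact gate layout of Fig. 4.6 is not recoverable from the text, so this layout and all names are our assumptions.

```python
import math

def hadamard(state, wire):
    """Apply a Hadamard gate to `wire` of a state given as {bit tuple: amplitude}."""
    new = {}
    s = 1 / math.sqrt(2)
    for bits, amp in state.items():
        for out_bit in (0, 1):
            sign = -1 if (bits[wire] == 1 and out_bit == 1) else 1
            out = bits[:wire] + (out_bit,) + bits[wire + 1:]
            new[out] = new.get(out, 0) + sign * s * amp
    return new

def cnot(state, control, target):
    """Apply a CNOT gate; it only relabels the basis states."""
    new = {}
    for bits, amp in state.items():
        out = list(bits)
        out[target] ^= out[control]
        new[tuple(out)] = amp
    return new

def cat_state(n):
    """Prepare (|0...0> + |1...1>) / sqrt(2) from the ground state."""
    state = {(0,) * n: 1.0}
    state = hadamard(state, 0)
    for k in range(1, n):
        state = cnot(state, k - 1, k)
    return state

if __name__ == "__main__":
    print(cat_state(3))
```

The resulting dictionary has exactly two entries, the all-zeros and the all-ones string, each with amplitude $1/\sqrt{2}$.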



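We remark that there is an algorithm to prepare an arbitrary quantum
state $|\varphi\rangle$ starting from the ground state $|0\rangle$, i.e. to
construct a quantum circuit $U_\varphi$ yielding
$U_\varphi|0\rangle = |\varphi\rangle$.

Algorithm 1 Let $|\varphi\rangle \in \mathcal{H}_{2^n}$ be a quantum state
which we would like to prepare. Do the following in a recursive way. Write

$$|\varphi\rangle = a|0\rangle|\varphi_0\rangle + b|1\rangle|\varphi_1\rangle ,$$

where $a$ and $b$ are complex numbers fulfilling $|a|^2 + |b|^2 = 1$. The
normalized states $|\varphi_0\rangle$ and $|\varphi_1\rangle$ on $n - 1$ qubits
appearing on the right-hand side of the above equation can be prepared by the
induction hypothesis using the circuits $U_1$ and $U_2$. Let $A$ be a local
transform on the first qubit with $A|0\rangle = a|0\rangle + b|1\rangle$, e.g.

$$A = \begin{pmatrix} a & -\bar{b} \\ b & \bar{a} \end{pmatrix} \otimes \mathbf{1}_{2^{n-1}} .$$

Then $|\varphi\rangle$ can be prepared by application of the circuit
$A \cdot \overline{\Lambda}_1(U_1) \cdot \Lambda_1(U_2)$ to the ground state
$|0\rangle$, where $A$ is applied first, $\overline{\Lambda}_1(U_1)$ applies
$U_1$ if the first qubit is 0 (inverted control) and $\Lambda_1(U_2)$ applies
$U_2$ if it is 1.

In general, the quantum circuit $U_\varphi$ for preparing a state
$|\varphi\rangle \in \mathcal{H}_{2^n}$ generated by this algorithm has a
complexity $\kappa(U_\varphi) = O(2^n)$, which is linear in the dimension of
the Hilbert space but exponential in the number of qubits.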
However, as the example of cat states previously mentioned shows, there are
states which admit much more efficient preparation sequences. In such a set
of states, we also find the so-called symmetric states [192]

$$\frac{1}{\sqrt{n+1}} \bigl( |00 \ldots 0\rangle + |10 \ldots 0\rangle + |01 \ldots 0\rangle + \cdots + |00 \ldots 1\rangle \bigr) ,$$

i.e. the union of the orbits of $|00 \ldots 0\rangle$ and
$|10 \ldots 0\rangle$ under the cyclic group acting on the qubits. As shown in
Sect. 4 of [192], these states can be prepared using $O(n)$ operations and a
quadratic overhead of ancilla qubits.

Finally, we give circuits for preparation of the states
$|\psi_\nu\rangle := (1/\sqrt{\nu}) \sum_{x=0}^{\nu-1} |x\rangle$ for
$\nu = 1, \ldots, 2^n$, which represent equal amplitudes over the first $\nu$
basis states of $\mathcal{H}_{2^n}$. The states $|\psi_\nu\rangle$ can be
efficiently prepared from the ground state $|0\rangle$ by the following
procedure (using the principle of binary search [193]), which is described in
Sect. 4 of [187].

Since $|\psi_{2^n}\rangle$ can easily be prepared by application of the
Hadamard transformation $H_2^{\otimes n}$, we can assume $\nu < 2^n$ without
loss of generality. We now choose $k \in \mathbb{N}$ such that
$2^k \le \nu < 2^{k+1}$ and apply the transformation

$$\frac{1}{\sqrt{\nu}} \begin{pmatrix} \sqrt{2^k} & -\sqrt{\nu - 2^k} \\ \sqrt{\nu - 2^k} & \sqrt{2^k} \end{pmatrix}$$

to the first bit of the ground state $|0\rangle$. Next we achieve equal
superposition on the first $2^k$ basis states
$|0 \ldots 0\rangle, \ldots, |0 \ldots 01 \ldots 1\rangle$ by application of an
$(n-k)$-fold controlled $\Lambda_{n-k}(H_2^{\otimes k})$ operation, which can
be implemented using $O(n^2)$ operations. Finally, we apply the preparation
circuit for the state $|\psi_{\nu - 2^k}\rangle$ (which has been constructed
by induction), conditioned on the $(k+1)$th bit. Overall, we obtain a
complexity for the preparation of $|\psi_\nu\rangle$ of $O(n^3)$ operations.

4.2.6 Quantum Turing Machines 

Quantum circuits provide a natural framework to specify unitary
transformations on finite-dimensional Hilbert spaces and give rise to
complexity models when factorizations into elementary gates, e.g. with
respect to the universal sets $\mathcal{G}_1$ or $\mathcal{G}_2$, are taken
into account.

Besides the formalism of quantum networks, there are other ways of
describing computations performed by quantum mechanical systems. In the
following we briefly review the model of quantum Turing machines (QTMs)
defined by Deutsch [7]. We remind the reader that Turing machines [194]
provide a unified model for classical deterministic and probabilistic
computation (see, e.g., [195]). The importance of Turing machines as a
unifying concept for classical computing manifests itself in the
Church-Turing thesis [196, 197], which, in its modern form, claims that every
physically reasonable model of computation can be efficiently simulated on a
probabilistic Turing machine.

Fig. 4.7. Configurations of a Turing machine

Definition 4.2. A deterministic Turing machine $T$ is defined by the data
$(Q, \Sigma, q_0, F, t)$, where $Q$ is a finite set of states, $\Sigma$ a set
of symbols, $q_0 \in Q$ a distinguished initial state, $F \subseteq Q$ the set
of final states, and
$t : Q \times \Sigma \to Q \times \Sigma \times \{\leftarrow, \downarrow, \rightarrow\}$
the transition function. The admissible actions of $T$ are movements of a
read-write head, which in one computational step can move to the left, stay
where it is or move to the right (we have denoted these actions by
$\{\leftarrow, \downarrow, \rightarrow\}$). Along with the Turing machine $T$
comes an infinite tape of cells (the cells are in bijection with
$\mathbb{Z}$) which can take symbols from $\Sigma$.

A configuration of a Turing machine $T$ is therefore given by a triplet
$(v, p, q) \in (\Sigma^{\mathbb{Z}}, \mathbb{Z}, Q)$, consisting of the state
$v$ of the tape, the position $p$ of the head and the internal state $q$. We
obtain a tree of configurations by considering two configurations $c_1$ and
$c_2$ to be adjacent if and only if $c_2$ is obtained from $c_1$ by an
elementary move, i.e. scanning a symbol from the tape, changing the internal
state, writing back a symbol to the tape and moving the head. The initial
state is the root of this tree, whereas the final states constitute its
leaves (see Fig. 4.7).

A probabilistic Turing machine differs from a deterministic one only in the
nature of the transition function $t$, which is then a mapping

$$t : Q \times \Sigma \times Q \times \Sigma \times \{\leftarrow, \downarrow, \rightarrow\} \to [0, 1] ,$$

which assigns probabilities from the real interval $[0, 1]$ to the possible
actions of $T$. A normalization condition, which guarantees the
well-formedness of a probabilistic Turing machine, is that for all
configurations the sum of the probabilities of all successors is 1.
Therefore, the admissible state transitions of a probabilistic Turing machine
$T$ can be described by a stochastic matrix $S_T$ with entries in $[0, 1]$,
where stochasticity means that the rows of $S_T$ add up to 1 and the
successor $c_{\mathrm{succ}}$ of $c$ is obtained by
$c_{\mathrm{succ}} = S_T \cdot c$. Note that a deterministic Turing machine is
a special case of a probabilistic Turing machine with a “subpermutation”
matrix $S_T$, i.e. $S_T$ can be obtained from a suitable permutation matrix by
deleting some of its rows.

One more variation of this idea is needed to finally arrive at the concept
of a quantum Turing machine: we have to require that the transition function
$t$ is a mapping with a normalization condition

$$t : Q \times \Sigma \times Q \times \Sigma \times \{\leftarrow, \downarrow, \rightarrow\} \to \mathbb{C} \qquad (4.4)$$

from a configuration to possibly many successors, each of which is given a
complex amplitude. Here the normalization constraint says that the matrix
$U_T$ describing the dynamics of $T$ on the state space is unitary.

Observe that there is one counterintuitive fact implied by this definition:
as $t$ assigns complex amplitudes to
$Q \times \Sigma \times Q \times \Sigma \times \{\leftarrow, \downarrow, \rightarrow\}$
(according to (4.4)), one can interpret a configuration of $T$ as being in a
superposition of
(i) tape symbols in each individual cell, (ii) states of the finite-state machine 
supported by Q and (iii) positions of the head. The last point in particular 
might look uncomfortable at first sight, but we remind the reader of the fact 
that in classical probabilistic computation each individual configuration is 
assigned a probability, so one can think of a probabilistic computation as 
traversing an exponentially large configuration space! The main difference 
of the QTM model is that because of negative amplitudes, computational 
paths in this configuration space can cancel each other out, i.e. the effects of 
interference can force the Turing machine into certain paths which ultimately 
may lead to the desired solution of the computational task. 

Now that the computational model of a QTM has been defined, the ques- 
tion arises as to what can be computed on a QTM, compared with a classical 
deterministic or probabilistic Turing machine. An important result in this 
context is that everything which can be computed classically in polynomial 
time can also be computed on a QTM because of the following theorem, which 
relies on some results of Bennett for reversible Turing machines [55, 198, 199] 
and was adapted to the QTM setting in [200]. As usual, we denote by $L^*$ the
language $\{0,1\}^*$ consisting of all binary strings and denote by $|x|$ the
length of the word $x \in L^*$.

Theorem 4.5. Let $f : L^* \to L^*$ be a polynomial-time computable function
such that $|f(x)|$ depends only on $|x|$. Then there is a polynomial-time QTM
$T$ that computes $|x\rangle|0\rangle \mapsto |x\rangle|f(x)\rangle$. The
running time of $T$ depends only on $|x|$.

Proof. The basic idea is to replace each elementary step in the computation
of $f$ by a reversible operation (using Theorem 4.3), keeping in mind that
ancilla qubits are needed to make the computation reversible (see
Sect. 4.2.3). We now adjoin additional qubits to the system, which are
initialized in the ground state $|0\rangle$, and apply controlled NOT
operations to them, using the computational qubits holding the result $f(x)$
as controls. Of course, after the application of this operation the state is
highly entangled between the computational register and the additional
register holding the result. Next, we run the whole computation that was done
to compute $f$ backwards, reversibly, on the computation register to get rid
of the garbage which might destroy the coherence, and end up with the state
$|x, 0, f(x)\rangle$, where the $|0\rangle$ refers to the ancilla bits used in
the first step of this procedure. □
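
The three stages of this proof (compute, copy, uncompute) can be traced on classical bits, because all gates involved permute basis states. The following sketch uses a toy function, the AND of three input bits, and a wire layout of our own choosing; it is an illustration of the idea, not a construction taken from [200].

```python
def toffoli(bits, a, b, t):
    """In-place Toffoli gate: bits[t] ^= bits[a] AND bits[b]."""
    bits[t] ^= bits[a] & bits[b]

def cnot(bits, c, t):
    """In-place CNOT gate: bits[t] ^= bits[c]."""
    bits[t] ^= bits[c]

def compute_and3(bits):
    """Reversibly compute x0 AND x1 AND x2 into wire 4, leaving garbage on wire 3."""
    toffoli(bits, 0, 1, 3)
    toffoli(bits, 3, 2, 4)

def uncompute_and3(bits):
    """Inverse of compute_and3: the same self-inverse gates in reverse order."""
    toffoli(bits, 3, 2, 4)
    toffoli(bits, 0, 1, 3)

def clean_evaluate(x):
    """Map (x, 0, 0, 0) to (x, 0, 0, f(x)) for f = AND of three bits."""
    bits = list(x) + [0, 0, 0]   # wires 3 and 4: work space, wire 5: result
    compute_and3(bits)           # work space now holds garbage and f(x)
    cnot(bits, 4, 5)             # copy f(x) to the fresh result wire
    uncompute_and3(bits)         # restore the work space to zeros
    return tuple(bits)

if __name__ == "__main__":
    print(clean_evaluate((1, 1, 1)))
    print(clean_evaluate((1, 1, 0)))
```

After the final stage the work wires are back in the state 0 for every input, so they are no longer correlated with $x$.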



Remark 4.2.

• The class of quantum Turing machines allows the definition and study 
of the important complexity class BQP [200],^ as well as the relation of 
BQP to other classes known from classical complexity theory. 

• There are programming primitives for QTMs, such as composition, loops 
and branching [200], as in the classical case. However, a problem arises 
in realizing while-loops, since the predicate which decides whether the
loop terminates can be in a superposition of true and false, depending on 
the computation path. Therefore all computations have to be arranged 
in such a way that this predicate is never in a superposed state, i.e. the 
state of the predicate has to be classical. As a consequence, we obtain 
the result that a quantum Turing machine can only perform loops with 
a prescribed number of iterations, which in turn can be determined by a 
classical Turing machine. 

• An important issue is whether QTMs constitute an analog or discrete
model of computation. One might be tempted to think of the possibility of
encoding an arbitrary amount of information into the transition amplitudes of
$t$, i.e. of producing a machine model which could benefit from computing
with complex numbers to arbitrary precision (for the strange effects of such
models see, e.g., [201]). However, see [200, 202] for a proof of the fact that
it is sufficient to take transition amplitudes from the finite set
$\{\pm 3/5, \pm 4/5, \pm 1, 0\}$ in order to approximate a given QTM to
arbitrary precision. The reason for this is that the Pythagorean-triple
transformation

$$\frac{1}{5} \begin{pmatrix} 3 & -4 \\ 4 & 3 \end{pmatrix} \in SO(2)$$

has eigenvalues of the form $e^{2\pi i \nu}$ with $\nu \notin \mathbb{Q}$ and,
therefore, generates a dense subgroup in $SO(2)$. The statement then follows
from the fact that the full unitary group on $\mathcal{H}_{2^n}$ can be
parametrized by $SO(2)$ matrices applied to arbitrary basis states and phase
rotations [172, 203].

• Yao has shown [204] that the computational models of QTM and uniform 
families of quantum gates (see Sect. 4.2.1) are polynomially equivalent, 
i.e. each model can simulate the other with polynomial time overhead. 
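
The density argument for the Pythagorean-triple rotation mentioned in Remark 4.2 can be explored numerically. In the sketch below the rotation is identified with multiplication by $(3 + 4i)/5$, the powers are computed in exact integer arithmetic, and the search bound `max_k` is an arbitrary choice of ours; the code illustrates the claim for finitely many powers and does not replace the proof in [200, 202].

```python
import math

def gaussian_power(k):
    """Return (re, im) of (3 + 4i)^k using exact integer arithmetic."""
    re, im = 1, 0
    for _ in range(k):
        re, im = 3 * re - 4 * im, 4 * re + 3 * im
    return re, im

def never_identity(max_k):
    """Check that no power k = 1..max_k of the rotation equals the identity."""
    return all(gaussian_power(k) != (5 ** k, 0) for k in range(1, max_k + 1))

def closest_power(target_angle, max_k):
    """Return (k, error): the power k <= max_k whose angle is closest to target_angle."""
    theta = math.atan2(4, 3)  # rotation angle of (1/5) [[3, -4], [4, 3]]

    def error(k):
        d = (k * theta - target_angle) % (2 * math.pi)
        return min(d, 2 * math.pi - d)

    best = min(range(1, max_k + 1), key=error)
    return best, error(best)

if __name__ == "__main__":
    print(never_identity(200))
    print(closest_power(1.0, 1000))
```

Since no power returns to the identity, the angles $k\theta \bmod 2\pi$ never repeat, and increasing `max_k` lets the closest power approach any target angle.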

^ BQP stands for “bounded-error quantum polynomial time”.

4.3 Using Entanglement for Computation: A First Quantum Algorithm

Entanglement between registers holding quantum states lies at the heart of 
the quantum algorithm which we shall describe in this section. The formula- 
tion of a problem on which a quantum computer will exceed the performance 
of any classical (probabilistic) computer may appear artificial. However, this 
was one of the first examples of problems on which a quantum computer could 
provably outperform any classical computer, with an exponential speedup. 

Because of its clarity and methodology, we present the quantum algo- 
rithm of Simon, in which many of the basic principles of quantum computing, 
namely the superposition principle, computing with preimages and the use 
of the Fourier transform, become apparent. We briefly remind the reader of 
the problem and mention that we are considering here a slightly generalized 
situation compared with the original setup (see also [57, 205, 206]). 

Quantum algorithms relying on the same principles have been given in [56, 
200]. As in the case of Simon’s problem described below, these algorithms rely 
on the Fourier transform for a suitably chosen abelian group. In both cases 
it has been shown that these quantum algorithms provide a superpolynomial 
gap over any classical probabilistic computer in the number of operations 
necessary to solve the corresponding problems. 

In the following, we denote by $\mathbb{Z}_2^n$ the elementary abelian 2-group of order
$2^n$, the elements of which we think of as being identified with binary strings
of length n, and denote addition in $\mathbb{Z}_2^n$ by $\oplus$.

Definition 4.3 (Simon’s Problem). Let $f : \mathbb{Z}_2^n \to \mathbb{Z}_2^n$ be a function given
as a black-box quantum circuit, i.e. f can be evaluated on superpositions of
states and is realized by a unitary transform $V_f$ specified by

\[ |x\rangle |0\rangle \mapsto |x\rangle |f(x)\rangle , \quad \text{for all } x \in \mathbb{Z}_2^n , \]

as described in Sect. 4.2.3 (see, in particular, (4.3)). In addition, it is specified
that there is a subgroup $U \subseteq \mathbb{Z}_2^n$ (the “hidden” subgroup) such that f takes a
constant value on each of the cosets $g \oplus U$ for $g \in \mathbb{Z}_2^n$ and, furthermore, f
takes different values on different cosets. The problem is to find generators
for U.

We can now formulate a quantum algorithm which solves Simon’s problem
in a polynomial number of operations on a quantum computer. This algorithm
uses O(n) evaluations of the black-box quantum circuit f. The classical
postcomputation, which is essentially linear algebra over GF(2), also
takes a number of operations which is polynomial in n.

Algorithm 2 This algorithm needs two quantum registers of length n, holding
elements of the domain and codomain of f, and consists of the following
steps.

1. Prepare the ground state

\[ |\varphi_1\rangle = |0 \ldots 0\rangle \otimes |0 \ldots 0\rangle \]

in both quantum registers.

2. Achieve equal amplitude distribution in the first register, for instance by
an application of a Hadamard transformation to each qubit:

\[ |\varphi_2\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in \mathbb{Z}_2^n} |x\rangle \otimes |0 \ldots 0\rangle . \]

3. Apply $V_f$ to compute f in superposition. We obtain

\[ |\varphi_3\rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in \mathbb{Z}_2^n} |x\rangle |f(x)\rangle . \]

4. Measure the second register to obtain some value z in the range of f.
Owing to the condition on f specified, the first register now holds a coset
$g_0 \oplus U$ of the hidden subgroup U, namely the set of elements which f maps
to $z = f(g_0)$:

\[ |\varphi_4\rangle = \frac{1}{\sqrt{|U|}} \sum_{f(x) = z} |x\rangle |z\rangle = \frac{1}{\sqrt{|U|}} \sum_{x \in g_0 \oplus U} |x\rangle |z\rangle . \]

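5. Application of the Hadamard transformation $H^{\otimes n}$ to the first register
transforms the coset into the superposition

\[ \frac{1}{\sqrt{|U^\perp|}} \sum_{y \in U^\perp} (-1)^{y \cdot g_0} |y\rangle |z\rangle . \]

The supported vectors of this superposition are the elements of $U^\perp$, which is the
group defined by $U^\perp := \{ y \in \mathbb{Z}_2^n : x \cdot y = 0 \ \forall x \in U \}$, i.e. the orthogonal
complement of U with respect to the scalar product in $\mathbb{Z}_2^n$ (see also
Sect. 4.4.2).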

6. Now measure the first register. We draw from the set of irreducible
representations of $\mathbb{Z}_2^n$ having U in the kernel, i.e. we obtain an equal
distribution over the elements of $U^\perp$.

7. By iterating steps 1-6, we produce elements of $\mathbb{Z}_2^n$ which generate the
group $U^\perp$ with high probability. After performing this experiment an
expected number of n times, we generate the group $U^\perp$ with probability
greater than $1 - [\ldots]$.

8. By solving linear equations over GF(2), it is easy to find generators for

\[ (U^\perp)^\perp = U . \]

Therefore we obtain generators for U by computing the kernel of a matrix
over GF(2) in time $O(n^3)$.

Analysis of Algorithm 2

• We first address the measurement in step 4. If we do not perform this
measurement, we are left with the state

\[ \frac{1}{\sqrt{2^n}} \sum_{\sigma \in \mathbb{Z}_2^n / U} \sum_{x \in U} |\sigma \oplus x\rangle |f(\sigma)\rangle \]

and if we continue with step 5 we shall obtain the state

\[ \frac{|U|}{2^n} \sum_{\sigma \in \mathbb{Z}_2^n / U} \sum_{y \in U^\perp} (-1)^{\sigma \cdot y} |y\rangle |f(\sigma)\rangle . \]

Therefore, sampling of the first register as in step 6 will yield an equal
distribution over $U^\perp$ and we can go on as in step 7. Hence we can omit
step 4.

• The reason for the application of the transformation $H^{\otimes n}$ and the
appearance of the group $U^\perp$ will be clarified in the following sections. As it
turns out, $H^{\otimes n}$ is an instance of a Fourier transform for an abelian group
and $\perp$ is an antiisomorphism of the lattice of subgroups of $\mathbb{Z}_2^n$.

• For the linear-algebra part in step 8, we refer the reader to standard texts
such as [207]. Gauss’s algorithm for computing the kernel of an n x n
matrix takes $O(n^3)$ arithmetic operations over the finite field $\mathbb{Z}_2 = GF(2)$.
Overall, we obtain the following cost: O(n) applications of $V_f$, $O(n^2)$
elementary quantum operations (which are all Hadamard operations $H_2$),
$O(n^2)$ measurements of individual qubits, and $O(n^3)$ classical operations
(arithmetic in GF(2)).
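
The steps of Algorithm 2 and the cost analysis above can be followed on a small example. The Python sketch below is a classical state-vector simulation under stated assumptions, not an implementation on quantum hardware: it stores all $2^n$ amplitudes, so it only works for small n, and the names `walsh_hadamard`, `sample_dual_element`, `gf2_kernel` and `recover_hidden_subgroup` are hypothetical helpers introduced for this illustration. Elements of $\mathbb{Z}_2^n$ are encoded as n-bit integers.

```python
import random


def walsh_hadamard(amps):
    """Apply the normalized n-fold Hadamard transform to a state vector."""
    a = list(amps)
    size = len(a)
    h = 1
    while h < size:
        for start in range(0, size, 2 * h):
            for j in range(start, start + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    scale = size ** -0.5
    return [v * scale for v in a]


def sample_dual_element(f, n, rng):
    """Simulate steps 1-6 once and return the measured element y."""
    size = 1 << n
    # Steps 1-4: measuring the second register yields z = f(x0) for a
    # uniformly random x0 and leaves the coset of all x with f(x) = z.
    z = f(rng.randrange(size))
    coset = [x for x in range(size) if f(x) == z]
    amps = [0.0] * size
    for x in coset:
        amps[x] = len(coset) ** -0.5
    # Step 5: Hadamard transform of the coset state.
    amps = walsh_hadamard(amps)
    # Step 6: measure the first register.
    probs = [abs(v) ** 2 for v in amps]
    return rng.choices(range(size), weights=probs)[0]


def gf2_kernel(rows, n):
    """Basis of {x : x . y = 0 for all y in rows} over GF(2) (step 8)."""
    pivots = {}  # pivot bit -> row in reduced row echelon form
    for r in rows:
        for bit, pivot_row in pivots.items():
            if (r >> bit) & 1:
                r ^= pivot_row
        if r:
            bit = r.bit_length() - 1
            for b in list(pivots):
                if (pivots[b] >> bit) & 1:
                    pivots[b] ^= r
            pivots[bit] = r
    basis = []
    for free_bit in (b for b in range(n) if b not in pivots):
        x = 1 << free_bit
        for bit, pivot_row in pivots.items():
            if (pivot_row >> free_bit) & 1:
                x |= 1 << bit
        basis.append(x)
    return basis


def recover_hidden_subgroup(f, n, samples, seed=0):
    """Step 7 followed by step 8: sample repeatedly, then solve over GF(2)."""
    rng = random.Random(seed)
    ys = [sample_dual_element(f, n, rng) for _ in range(samples)]
    return gf2_kernel(ys, n)


if __name__ == "__main__":
    s = 0b101                      # hidden subgroup U = {0, s} (assumed example)
    f = lambda x: min(x, x ^ s)    # constant on cosets, distinct across cosets
    print(recover_hidden_subgroup(f, 3, samples=20))
```

Taking 20 samples for n = 3 is a deliberately generous choice for the illustration; the analysis above indicates that a number of samples linear in n is enough.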



4.4 Quantum Fourier Transforms: the Abelian Case 



In this section we recall the definition and basic properties of the discrete 
Fourier transform (DFT) and give examples of its use in quantum computing. 

In most standard texts on signal processing (e.g. [208]), the $\mathrm{DFT}_N$ of a
periodic signal given by a function $f : \mathbb{Z}_N \to \mathbb{C}$ (where $\mathbb{Z}_N$ is the cyclic group
of order N) is defined as the function F given by

\[ F(\nu) := \frac{1}{\sqrt{N}} \sum_{t \in \mathbb{Z}_N} e^{2\pi i \nu t / N} f(t) . \]

If, equivalently, we adopt the point of view that the signal f and the Fourier
transform F are vectors in $\mathbb{C}^N$, we see that performing the $\mathrm{DFT}_N$ is a
matrix-vector multiplication of f with the unitary matrix

\[ \mathrm{DFT}_N := \frac{1}{\sqrt{N}} \left[ \omega^{i \cdot j} \right]_{i,j = 0, \ldots, N-1} , \]

where $\omega = e^{2\pi i / N}$ denotes a primitive Nth root of unity.

From an algebraic point of view, the $\mathrm{DFT}_N$ gives an isomorphism $\Phi$ of
the group algebra $\mathbb{C}\mathbb{Z}_N$,

\[ \Phi : \mathbb{C}\mathbb{Z}_N \to \mathbb{C}^N , \]

onto the direct sum of the irreducible matrix representations of $\mathbb{Z}_N$ (where
multiplication is performed pointwise). This means that $\mathrm{DFT}_N$ decomposes
the regular representation of $\mathbb{Z}_N$ into its irreducible constituents. It is known
that this property allows the derivation of a fast convolution algorithm [171]
in a canonical way and can be generalized to more general group circulants.
Viewing the DFT as a decomposition matrix for the regular representation
of a group leads to the generalization of Fourier transforms to arbitrary finite
groups (cf. [171] and Sect. 4.6.1).

The fact that $\mathrm{DFT}_N$ can be computed in $O(N \log N)$ arithmetic operations
(counting additions and multiplications) is very important for applications
in classical signal processing. This possibility of performing a fast Fourier
transform justifies the heavy use of the $\mathrm{DFT}_N$ in today’s computing technology.
In the next section we shall show that the $O(N \log N)$ bound, which is
sharp in the arithmetic complexity model (see Chap. 4 and 5 of [174]), can be
improved with a quantum computer to $O((\log N)^2)$ operations.

4.4.1 Factorization of $\mathrm{DFT}_N$

From now on we restrict ourselves to cases of $\mathrm{DFT}_N$ where $N = 2^n$ is a power
of 2, since these transforms naturally fit the tensor structure imposed by the
qubits.

The efficient implementation of the Fourier transform on a quantum computer
starts from the well-known Cooley-Tukey decomposition [209]: after the
row permutation $\Pi_\tau$, where $\tau = (1, \ldots, n)$ is the cyclic shift on the qubits, is
performed, the $\mathrm{DFT}_{2^n}$ has the following block structure [171]:

\[ \Pi_\tau \, \mathrm{DFT}_{2^n} = \frac{1}{\sqrt{2}} \begin{pmatrix} \mathrm{DFT}_{2^{n-1}} & \mathrm{DFT}_{2^{n-1}} \\ \mathrm{DFT}_{2^{n-1}} W_n & -\mathrm{DFT}_{2^{n-1}} W_n \end{pmatrix} = (\mathbf{1}_2 \otimes \mathrm{DFT}_{2^{n-1}}) \cdot T_n \cdot (\mathrm{DFT}_2 \otimes \mathbf{1}_{2^{n-1}}) . \]

Here we denote by

\[ T_n := \mathbf{1}_{2^{n-1}} \oplus W_n , \qquad W_n := \mathrm{diag}\left( 1, \omega_{2^n}, \omega_{2^n}^2, \ldots, \omega_{2^n}^{2^{n-1}-1} \right) \]

the matrix of twiddle factors [171]. Taking into account the fact that $W_n$ has
the tensor decomposition

\[ W_n = \mathrm{diag}\left(1, \omega_{2^n}^{2^{n-2}}\right) \otimes \cdots \otimes \mathrm{diag}\left(1, \omega_{2^n}^{2}\right) \otimes \mathrm{diag}\left(1, \omega_{2^n}\right) , \]

we see that $T_n$ can be implemented by n-1 gates having one control wire each.
These can be factored into the elementary gates Qi with constant overhead.

Because tensor products are free in our computational model, by recursion
we arrive at an upper bound of $O(n^2)$ for the number of elementary operations
necessary to compute the discrete Fourier transform on a quantum computer
(this operation will be referred to as “QFT”).
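
The factorization can be checked numerically for small n. The Python sketch below is an illustration under assumptions: it uses the unitary normalization with $\omega = e^{2\pi i/N}$, it reads the row permutation $\Pi_\tau$ as "even-indexed rows first, then odd-indexed rows" (moving the least significant output bit to the front), and the helper names `dft`, `kron`, `matmul`, `twiddle`, `permuted_dft` and `factorization_error` are hypothetical.

```python
import cmath


def dft(size):
    """Unitary DFT matrix (1/sqrt(size)) * [omega^(k*j)]."""
    omega = cmath.exp(2j * cmath.pi / size)
    scale = size ** -0.5
    return [[scale * omega ** (k * j) for j in range(size)] for k in range(size)]


def identity(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]


def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def twiddle(size):
    """T_n = identity (+) W_n with W_n = diag(omega^0, ..., omega^(size/2 - 1))."""
    half = size // 2
    omega = cmath.exp(2j * cmath.pi / size)
    diag = [1.0] * half + [omega ** j for j in range(half)]
    return [[diag[i] if i == j else 0.0 for j in range(size)] for i in range(size)]


def permuted_dft(size):
    """Rows of the DFT matrix reordered: even-indexed rows first, then odd."""
    full = dft(size)
    half = size // 2
    return [full[2 * k1 + k0] for k0 in range(2) for k1 in range(half)]


def factorization_error(n):
    """Largest entrywise deviation between both sides of the factorization."""
    size = 1 << n
    half = size // 2
    rhs = matmul(kron(identity(2), dft(half)),
                 matmul(twiddle(size), kron(dft(2), identity(half))))
    lhs = permuted_dft(size)
    return max(abs(lhs[i][j] - rhs[i][j])
               for i in range(size) for j in range(size))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, factorization_error(n))
```

Applying the same identity recursively to the factor $\mathrm{DFT}_{2^{n-1}}$ adds one Hadamard gate and at most n-1 controlled phase gates per level, which is where the quadratic gate count comes from.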

In Fig. 4.8, the derived decomposition into quantum gates is displayed
using the graphical notation introduced in Sect. 4.2.1. The gates labeled by
$D_k$ in this circuit are the diagonal phase shifts diag(1, [...]) and, in
addition, we have used the abbreviation $N = 2^n$. We observe that the
permutations $\Pi_\tau$, which arose in the Cooley-Tukey formula, have all been
collected together, yielding the so-called bit reversal, which is the permutation
of the quantum wires (1, n) (2, n-1) . . . (n/2, n/2 + 1) when n is even and
(1, n) (2, n-1) . . . ((n-1)/2, (n+3)/2) when n is odd.




Fig. 4.8. Quantum circuit that computes a Fourier transform $\mathrm{QFT}_{2^n}$



If we intend to use the Fourier transform as a sampling device, i.e. the
application of $\mathrm{QFT}_{2^n}$ to a state $|\psi\rangle$ followed directly by a measurement in
the standard basis, we can use the structure of this quantum circuit that
computes a Fourier transform $\mathrm{QFT}_{2^n}$ to avoid nearly all quantum gates [210].

In order to give circuits for the QFT for an arbitrary abelian group, we 
need to describe the structure of these groups and their representations first, 
which is done in the next section. 

4.4.2 Abelian Groups and Duality Theorems 

Let A be a finite abelian group. Then A splits into a direct product of its
p-components: $A = A_{p_1} \times \ldots \times A_{p_n}$, where $p_i$, $i = 1, \ldots, n$, are the prime
divisors of the order |A| of A (see [211], Part I, paragraph 8).

Furthermore, each $A_{p_i}$, which consists of all elements annihilated by a
power of $p_i$, has the form $A_{p_i} = \mathbb{Z}_{p_i^{r_{i,1}}} \times \ldots \times \mathbb{Z}_{p_i^{r_{i,k_i}}}$ (see [211], Part I,
paragraph 8). Both statements follow immediately from the structure theorem for
finitely generated modules over principal ideal rings.

We remark that a Fourier transform for a finite abelian group can easily be
constructed from this knowledge: given two groups $G_1$ and $G_2$, the irreducible
representations of their direct product $G_1 \times G_2$ can be obtained from those
of $G_1$ and $G_2$ as follows.

Theorem 4.6. The irreducible representations of $G = G_1 \times G_2$ are given by

\[ \mathrm{Irr}(G) = \{ \phi_1 \otimes \phi_2 : \phi_1 \in \mathrm{Irr}(G_1), \ \phi_2 \in \mathrm{Irr}(G_2) \} . \]

(For a proof see [212].) If we therefore encode the elements of A according to
the direct-product decomposition as above, the matrix

\[ \mathrm{DFT}_A = \bigotimes_{i=1}^{n} \bigotimes_{j=1}^{k_i} \mathrm{DFT}_{p_i^{r_{i,j}}} \]

is a decomposition matrix for the regular representation of A.

Corollary 4.1. Let A be a finite abelian group of order $2^n$. Then a Fourier
transform for A can be computed in $O(n^2)$ elementary operations.

Proof. The decomposition of $\mathrm{DFT}_{2^n}$ has already been considered and an
implementation in $O(n^2)$ many operations has been derived from the
Cooley-Tukey formula in Sect. 4.4.1. Since tensor products are free in our
computational model, we can conclude that a direct factor $\mathbb{Z}_{2^n}$ is already the worst
case for an implementation. □

Example 4.6. The Fourier transform for the elementary abelian 2-group $\mathbb{Z}_2^n$
is given by the tensor product $\mathrm{DFT}_2 \otimes \cdots \otimes \mathrm{DFT}_2$ of the Fourier transform
for the cyclic factors and hence coincides with the Hadamard matrix $H^{\otimes n}$
used in Algorithm 2.

The Dual Group. Given a finite abelian group A, we can consider $\mathrm{Hom}(A, \mathbb{C}^*)$,
i.e. the group of characters of A (NB: in the nonabelian case a character is
generalized to the traces of the representing matrices and hence is not a
homomorphism anymore). The following theorem says that A is isomorphic to
its group of characters.

Theorem 4.7. For a finite abelian group A, we have $A \cong \mathrm{Hom}(A, \mathbb{C}^*)$.

We shall make the isomorphism $\phi$ explicit. Choose $\omega_{p_1}, \ldots, \omega_{p_n}$, where
the $\omega_{p_i}$ are primitive $p_i$th roots of unity in $\mathbb{C}^*$. Then $\phi$ is defined by the
assignment $\phi(e_i) := \omega_{p_i}$ on the elements $e_i := (0, \ldots, 1, \ldots, 0)$, where the 1
is in the ith position. The isomorphism $\phi$ is not canonical, since a different
choice of primitive roots of unity will yield a different isomorphism.

Next we observe that there is a pairing $\beta$ (i.e. a bilinear map) between A
and $\mathrm{Hom}(A, \mathbb{C}^*)$ via $\beta : (a, f) \mapsto f(a) \in \mathbb{C}^*$. We say that a is orthogonal
to f (in symbols, $a \perp f$) if $\beta(a, f) = 1$. For a given subgroup $U \subseteq A$ we can
now define (see also Algorithm 2) the orthogonal complement $U^\perp := \{ y \in
A : \beta(y, x) = 1, \ \forall x \in U \}$. The following duality theorem holds (see [184]):

Theorem 4.8. For all subgroups $U, U' \subseteq A$, the following identities hold:

(a) Self-duality of $\perp$:

\[ (U^\perp)^\perp = U . \]

(b) Complementarity:

\[ (U \cap U')^\perp = \langle U^\perp, U'^\perp \rangle . \]

(c) The mapping $\perp$ is an inclusion-reversing antiisomorphism on the lattice
of subgroups of A (i.e. $\perp$ is a Galois correspondence).



4.4.3 Sampling of Fourier Coefficients 



In this section we address the problem of gaining information from the Fourier
coefficients of a special class of states. More precisely, we consider the Fourier
transforms of the (normalized) characteristic function

\[ |\chi_{c+U}\rangle := \frac{1}{\sqrt{|U|}} \sum_{x \in c+U} |x\rangle \qquad (4.5) \]

of a coset c+U of a subgroup $U \subseteq A$ of an abelian group A. The question of
what conclusions about U can be drawn from measuring the Fourier
transformed state (4.5) is of special interest, as we have already seen in Sect. 4.3.
The following theorem shows that for an abelian group, the Fourier transform
maps subgroups to their duals.

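Theorem 4.9. Let $\mathrm{DFT}_A = (1/\sqrt{|A|}) \sum_{x,y \in A} \beta(x,y) |y\rangle \langle x|$ be the Fourier
transform for the abelian group A. Then for each subgroup $U \subseteq A$, we have

\[ \mathrm{DFT}_A \, \frac{1}{\sqrt{|U|}} \sum_{x \in U} |x\rangle = \frac{1}{\sqrt{|U^\perp|}} \sum_{y \in U^\perp} |y\rangle . \]

Proof. Since

\[ \mathrm{DFT}_A \, \frac{1}{\sqrt{|U|}} \sum_{x \in U} |x\rangle = \frac{1}{\sqrt{|A|}} \sum_{x,y \in A} \beta(x,y) |y\rangle \langle x| \cdot \frac{1}{\sqrt{|U|}} \sum_{x \in U} |x\rangle = \frac{1}{\sqrt{|A| |U|}} \sum_{y \in A} \sum_{x \in U} \beta(x,y) |y\rangle , \]

it suffices to show that $\sum_{x \in U} \beta(x,y) = 0$ for $y \notin U^\perp$, but this statement
follows from the fact that the existence of $x_0 \in U$ with $\beta(x_0, y) \neq 1$ implies that
$\sum_{x \in U} \beta(x,y) = \sum_{x \in U} \beta(x + x_0, y) = \beta(x_0, y) \sum_{x \in U} \beta(x,y)$. The other case,
$\sum_{x \in U} \beta(x,y) = |U|$ for $y \in U^\perp$, is obvious. □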

Hence, measuring the Fourier spectrum of $|\chi_U\rangle$ yields an equal distribution
on the elements of the dual group $U^\perp$. Also, in the case of the characteristic
function of a coset c+U instead of U, we obtain the same probability
distribution on $U^\perp$ since the Fourier transform diagonalizes the group action
completely in the abelian case, i.e. the translation by c corresponds to a
pointwise multiplication by phases in the Fourier basis:

\[ \mathrm{DFT}_A \, |\chi_{c+U}\rangle = \frac{1}{\sqrt{|U^\perp|}} \sum_{y \in U^\perp} \varphi_{c,y} |y\rangle , \]

where the $\varphi_{c,y} \in U(1)$ are phase factors which depend on c and y but are
always eth roots of unity, where e denotes the exponent of A. Since making
measurements involves taking the squared absolute values of the amplitudes, we
obtain an equal distribution over $U^\perp$. The states $|\chi_{c+U}\rangle$, which in general will be highly
entangled, make the principle of interference and its use in quantum
algorithms apparent: only those Fourier coefficients remain which correspond to
the elements of $U^\perp$ (constructive interference), whereas the amplitudes of all
other elements vanish (destructive interference).
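
The interference pattern can be reproduced for a small cyclic group. The Python sketch below is an illustration under assumptions: it takes $A = \mathbb{Z}_{12}$ with the pairing $\beta(x, y) = e^{2\pi i x y / 12}$, the subgroup U = {0, 4, 8} and the coset 1 + U as an arbitrary example; the function names `coset_state` and `dft_probabilities` are hypothetical.

```python
import cmath


def coset_state(modulus, generator, shift):
    """Normalized characteristic function of shift + <generator> in Z_modulus."""
    subgroup = sorted({(generator * k) % modulus for k in range(modulus)})
    amps = [0.0] * modulus
    for u in subgroup:
        amps[(shift + u) % modulus] = len(subgroup) ** -0.5
    return amps


def dft_probabilities(amps):
    """Measurement probabilities after applying the unitary DFT to `amps`."""
    size = len(amps)
    omega = cmath.exp(2j * cmath.pi / size)
    probs = []
    for y in range(size):
        amp = sum(omega ** ((x * y) % size) * amps[x] for x in range(size))
        probs.append(abs(amp / size ** 0.5) ** 2)
    return probs


if __name__ == "__main__":
    # U = {0, 4, 8} in Z_12, so the dual group is {0, 3, 6, 9}.
    for shift in (0, 1):
        probs = dft_probabilities(coset_state(12, 4, shift))
        print(shift, [round(p, 3) for p in probs])
```

Both choices of the shift give probability 1/4 on each of 0, 3, 6, 9 and probability 0 elsewhere, so the measurement statistics carry no information about the coset representative c.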

4.4.4 Schur’s Lemma and its Applications 
in Quantum Computing 

In this section we explore the underlying reason behind Theorem 4.9, namely 
Schur’s lemma. We present further applications of this powerful tool from 
representation theory. For a proof of Schur’s lemma we refer the reader to 
Sect. 2 of [213]. 

Lemma 4.2 (Schur’s lemma). Let $\rho_1 : G \to GL_{\mathbb{C}}(V)$ and $\rho_2 : G \to
GL_{\mathbb{C}}(W)$ be irreducible complex representations of a group G. Suppose that
the element $A \in \mathrm{End}(V, W)$ has the property

\[ \rho_1(g) \cdot A = A \cdot \rho_2(g), \quad \forall g \in G , \]

i.e. A commutes with all pairs of images $\rho_1(g), \rho_2(g)$. Then exactly one of the
following cases holds:

(i) $\rho_1 \not\cong \rho_2$. In this case $A = 0_{\mathrm{End}(V,W)}$.

(ii) $\rho_1 \cong \rho_2$. In this case A is a homothety, i.e. $A = \lambda \cdot \mathbf{1}_{\mathrm{End}(V,W)}$ with an
element $\lambda \in \mathbb{C}$.

We give two applications of Schur’s lemma which make its importance in
the context of Fourier analysis apparent. First, we present a reformulation
of Theorem 4.9 in representation-theoretical terms and then state a theorem
which is useful in the sampling of functions having a hidden normal subgroup.

Theorem 4.10. Let G be a finite abelian group, $\mathrm{Hom}(G, \mathbb{C}^*)$ the dual group,
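and $\beta : G \times \mathrm{Hom}(G, \mathbb{C}^*) \to \mathbb{C}^*$ the canonical pairing defined by $\beta(g, \varphi) :=
\varphi(g)$. Then for each subgroup $U \subseteq G$, the following holds (normalization
omitted):

\[ \mathrm{DFT}_G \sum_{u \in U} |u\rangle = \sum_{\substack{\varphi \in \mathrm{Hom}(G, \mathbb{C}^*) \\ U \subseteq \mathrm{Ker}(\varphi)}} |\varphi\rangle . \]

Moreover, we obtain the following identity for the cosets $u_0 + U$:

\[ \mathrm{DFT}_G \sum_{u \in U} |u_0 + u\rangle = \sum_{\substack{\varphi \in \mathrm{Hom}(G, \mathbb{C}^*) \\ U \subseteq \mathrm{Ker}(\varphi)}} \beta(u_0, \varphi) \, |\varphi\rangle . \]

Proof. The mapping $\mathrm{DFT}_G : \mathbb{C}G \to \mathbb{C}^{|G|}$ is given by evaluation of elements
of G for the irreducible representations $\{\varphi_1, \ldots, \varphi_s\}$ of G, which are all
one-dimensional (i.e. s = |G|) and hence are characters, since G was assumed to
be abelian. Therefore, the coefficient for the irreducible representation $\varphi_i$ is
computed from

\[ \sum_{u \in U} \varphi_i(u_0 + u) = \sum_{u \in U} \varphi_i(u_0) \cdot \varphi_i(u) = \varphi_i(u_0) \cdot \sum_{u \in U} \varphi_i(u) . \]

Considering the restricted characters $\varphi_i \downarrow U$, we use Schur’s lemma (Lemma
4.2) to deduce that $\sum_{u \in U} \varphi_i(u) = 0$ iff $U \not\subseteq \mathrm{Ker}(\varphi_i)$. □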

Theorem 4.11. Let G be an arbitrary finite group and $N \trianglelefteq G$ a normal
subgroup of G. Then for each irreducible representation $\rho$ of G of degree d,
exactly one of the following cases holds:

\[ \frac{1}{|N|} \sum_{n \in N} \rho(n) = \mathbf{1}_{d \times d} \quad \text{or} \quad \frac{1}{|N|} \sum_{n \in N} \rho(n) = 0_{d \times d} , \]

where the first case applies iff N is contained in the kernel of $\rho$.

Proof. Let $A := (1/|N|) \sum_{n \in N} \rho(n)$ denote the equal distribution over all
images of N under $\rho$. Then, from the assumption of normality of N, we
obtain

\[ \rho(g)^{-1} \cdot A \cdot \rho(g) = A , \]

i.e. A commutes with the irreducible representation $\rho$. Using Schur’s lemma,
we conclude that $A = \lambda \cdot \mathbf{1}_{d \times d}$. If $N \subseteq \mathrm{Ker}(\rho)$ we find $A = \mathbf{1}_{d \times d}$. On the
other hand, if $N \not\subseteq \mathrm{Ker}(\rho)$ we conclude that A is the zero matrix $0_{d \times d}$,
because otherwise an element $n_0 \in N$ with $\rho(n_0) \neq \mathbf{1}_{d \times d}$ would lead to
the contradiction

\[ \rho(n_0) \cdot \sum_{n \in N} \rho(n) = \sum_{n \in N} \rho(n_0 n) = \sum_{n \in N} \rho(n) , \]

since from this we could conclude $\rho(n_0) = \mathbf{1}_{d \times d}$, contrary to the assumption
$n_0 \notin \mathrm{Ker}(\rho)$. □



4.5 Exploring Quantum Algorithms 



4.5.1 Grover’s Algorithm 



We give an outline of Grover’s algorithm for searching an unordered list and 
present an optical implementation using Fourier lenses. 

The search algorithm allows one to find elements in a list of N items
fulfilling a given predicate in time $O(N^{1/2})$. We assume that the predicate f
is given by a quantum circuit $V_f$ and, as usual in this setting, we count the
invocations of $V_f$ (the “oracle”). It is straightforward to construct from $V_f$ an
operator $S_f$ which flips the amplitudes of the states that fulfill the predicate,
i.e. $S_f : |x\rangle \mapsto (-1)^{f(x)} |x\rangle$ for all basis states $|x\rangle$. The Grover algorithm relies
on an averaging method called inversion about average, which is described in
the following.

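Consider the matrix

\[ D_n := \begin{pmatrix} -1 + \frac{2}{2^n} & \frac{2}{2^n} & \cdots & \frac{2}{2^n} \\ \frac{2}{2^n} & -1 + \frac{2}{2^n} & \cdots & \frac{2}{2^n} \\ \vdots & & \ddots & \vdots \\ \frac{2}{2^n} & \frac{2}{2^n} & \cdots & -1 + \frac{2}{2^n} \end{pmatrix} . \qquad (4.6) \]

$D_n$ is a circulant matrix [214], i.e. $D_n = \mathrm{circ}_G(-1 + (2/2^n), (2/2^n), \ldots, (2/2^n))$
for any choice of a finite group G. To see this, we recall the definition of a
general group circulant,

\[ \mathrm{circ}_G(v) := \left( v_{g_i^{-1} g_j} \right)_{1 \le i,j \le |G|} \]

for a fixed ordering $G = \{g_1, \ldots, g_{|G|}\}$ of the elements of G and for a vector v
which is labeled by G. To implement (4.6) on a quantum computer we apply
the circulant for the group $\mathbb{Z}_2^n$, making use of the following theorem.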

Theorem 4.12. The Fourier transform $\mathrm{DFT}_A$ for a finite abelian group A
implements a bijection between the set of A-circulant matrices and the set of
diagonal matrices over $\mathbb{C}$. Explicitly, each circulant C is of the form

\[ C = \mathrm{DFT}_A^{-1} \cdot \mathrm{diag}(d_1, \ldots, d_n) \cdot \mathrm{DFT}_A \]

and the vector $d = (d_1, \ldots, d_n)$ of diagonal entries is given by $d = \mathrm{DFT}_A \cdot c$,
where c is the first row of C.

Fig. 4.9. (a) Equal distribution, (b) flip solutions, (c) inversion about average



Grover’s original idea was to use correlations to amplify the amplitudes
of the states fulfilling the predicate, i.e. to correlate - starting from an
initial distribution, which is chosen to be the equal distribution P(X = i) =
1/N, i = 1, . . . , N - the vector (-1, 1, . . . , 1) with the probability distribution
obtained by flipping the signs of the states fulfilling the predicate.

Starting from the equal distribution $H^{\otimes n}|0\rangle$, the amplification process in
Grover’s algorithm consists of an iterated application of the operator $-D_n S_f$
$O(\sqrt{2^n})$ times [215]. In Fig. 4.9, the steps of this procedure are illustrated in
a qualitative way.
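
The amplification process can be traced on a classical amplitude vector. The Python sketch below is an illustration under assumptions: it stores all $2^n$ amplitudes explicitly, it applies $D_n S_f$ rather than $-D_n S_f$ (the global sign does not change any measurement probability), and the names `grover_success_probability` and `optimal_iterations`, as well as the iteration count $\lfloor (\pi/4) \sqrt{2^n} \rfloor$ for a single solution, are choices made here rather than taken from the text.

```python
import math


def grover_success_probability(n, marked, iterations):
    """Probability of measuring a marked state after the given iterations."""
    size = 1 << n
    amps = [size ** -0.5] * size  # equal distribution
    for _ in range(iterations):
        # S_f: flip the sign of the amplitudes that fulfil the predicate.
        amps = [-a if x in marked else a for x, a in enumerate(amps)]
        # D_n: inversion about average, a -> 2 * mean - a, which is the
        # action of the matrix (4.6) on the amplitude vector.
        mean = sum(amps) / size
        amps = [2 * mean - a for a in amps]
    return sum(amps[x] ** 2 for x in marked)


def optimal_iterations(n, solutions=1):
    """Iteration count of order sqrt(2^n / solutions) (assumed constant pi/4)."""
    return math.floor(math.pi / 4 * math.sqrt((1 << n) / solutions))


if __name__ == "__main__":
    n, marked = 4, {5}  # hypothetical example: 16 items, item 5 is marked
    for k in range(optimal_iterations(n) + 2):
        print(k, round(grover_success_probability(n, marked, k), 4))
```

For one marked item among 16, three iterations raise the success probability from 1/16 to about 0.96; a fourth iteration lowers it again, which is why the number of iterations has to be chosen in advance.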

We present a realization of the Grover algorithm with a diffractive optical
system. This is not a quantum mechanical realization in the sense of
quantum computing, since such a system does not support a qubit architecture.
However, the transition matrices are unitary and hence we can consider
simulations of quantum algorithms via optical devices. These optical setups scale
linearly with the dimension of the computational Hilbert space rather than
with the logarithmic growth of a quantum register; nevertheless they have
some remarkable properties, the best known of which is the ability to perform
a Fourier transform by a simple application of a cylindrical lens [216]. This
resembles the classical Fourier transform (corresponding to $\mathrm{DFT}_{\mathbb{Z}_{2^n}}$ rather
than $\mathrm{DFT}_{\mathbb{Z}_2^n}$), which is correct because of the comments preceding Theorem
4.12.

Observe that by starting from this operation, we can easily perform
correlations since this corresponds to a multiplication with a circulant matrix.
Every circulant matrix can be realized optically using a so-called 4f setup,
which corresponds to a factorization of a circulant matrix C into diagonal
matrices and Fourier transforms (following Theorem 4.12): $C = \mathrm{DFT}^{-1} \cdot D \cdot \mathrm{DFT}$,
where D is a given diagonal matrix (see also Fig. 4.10).



[Figure: incoming wavefront h; diffractive element G; outgoing wavefront h * g]

Fig. 4.10. Optical 4f setup that computes a convolution of h and g

Remark 4.3. The question of what linear transforms can be performed
optically is equivalent to the question of what matrices can be factored into
diagonal matrices and Fourier transforms, which correspond to diagonal
matrices and circulant matrices. It has been shown in [217] that for fields
$K \notin \{GF(3), GF(5)\}$, every square matrix M with entries in K can be
written as a product of circulant and diagonal matrices with entries in K.
Furthermore, if M is unitary the circulant and diagonal factors can also be
chosen to be unitary [217].

Remark 4.4. There have been several generalizations and modifications of
the original Grover algorithm.

• The case of an unknown number of solutions fulfilling the given predicate
is analyzed in [218].

• The issue of arbitrary initial distributions (instead of an equal
distribution) is considered in [219].

• In [215] it is shown that instead of the diffusion operators $D_n$ almost
any (except for a set of matrices of measure zero) unitary matrix can be
used to perform the Grover algorithm.

• There is a quantum algorithm for the so-called collision problem which
needs $O(\sqrt[3]{N/r})$ evaluations of a given r-to-one function f to find a pair
of values which are mapped to the same element [220].

• The Grover algorithm has been shown to be optimal, i.e. the problem
of finding an element in an unordered list takes $\Omega(\sqrt{N})$ operations on a
quantum computer [200].

• Some other problems have been solved by a subroutine call to the Grover
algorithm, e.g. a problem in communication complexity: the problem of
deciding whether $A, B \subseteq \{1, \ldots, N\}$ are disjoint or not. Using the Grover
algorithm, it can be shown that it is sufficient for the two parties, one
holding A and the other holding B, to communicate $O(\sqrt{N} \log N)$ qubits
to solve this problem [221].

4.5.2 Shor’s Algorithm 

In this section we briefly review Shor’s factorization algorithm and show how
the Fourier transform comes into play. It is known that factoring a number N
is easy under the assumption that it is easy to determine the (multiplicative)
order of an arbitrary element in $(\mathbb{Z}_N)^\times$. For a proof of this, we refer the
reader to [222] and to Shor’s original paper [9].

Once this reduction has been done, the following observation is the
crucial step for the quantum algorithm. Let y be randomly chosen and let
gcd(y, N) = 1. To determine the multiplicative order r of y mod N,
consider the function

\[ f_y(x) := y^x \bmod N . \]

Clearly $f_y(x + r) = f_y(x)$, i.e. $f_y$ is a periodic function with period r. The
quantum algorithm to determine this period is as follows:

Algorithm 3 Let N be given; determine $M = 2^m$ such that $N^2 < M < 2N^2$.
This number M will be the length of the Fourier transform to be performed
in the following.

1. Randomly choose y with gcd(y, N) = 1.

2. Prepare the state $|0\rangle \otimes |0\rangle$ in two registers of lengths m and $\lceil \log_2 N \rceil$.

3. Application of the Hadamard transform $H^{\otimes m}$ to the left part of the
register results in a superposition of all possible inputs

\[ \frac{1}{\sqrt{M}} \sum_{x=0}^{M-1} |x\rangle \otimes |0\rangle . \]

4. We construct a unitary operation which computes the (partial) function
$|x\rangle |0\rangle \mapsto |x\rangle |y^x \bmod N\rangle$ following Sect. 4.2.3 and 4.2.4. Calculation of
$f_y(x) = y^x \bmod N$ yields for this superposition (normalization omitted)

\[ \sum_{x=0}^{M-1} |x\rangle \otimes |y^x \bmod N\rangle . \]

5. Measuring the right part of the register gives a certain value $z_0$. The
remaining state is the superposition of all x satisfying $f_y(x) = z_0$:

\[ \sum_{y^x = z_0} |x\rangle |z_0\rangle = \sum_{k=0}^{s-1} |x_0 + kr\rangle |z_0\rangle , \]

where $y^{x_0} = z_0$ and s counts the terms of the sum (roughly M/r).

6. Performing a $\mathrm{QFT}_M$ on the left part of the register leads to

\[ \sum_{l=0}^{M-1} \sum_{k=0}^{s-1} e^{2\pi i (x_0 + kr) l / M} |l\rangle |z_0\rangle . \]

7. Finally, a measurement of the left part of the register gives a value $l_0$.

Application of this algorithm produces data from which the period r can be
extracted after classical postprocessing involving Diophantine approximation
(see Sect. 4.5.2).

A thorough analysis of this algorithm must take into account the overhead
for the calculation of the function $f_y : x \mapsto y^x \bmod N$. However, after this
function has been realized as a quantum network once (which can be obtained
from a classical network for this function in polynomial time), the
superposition principle applies since all inputs can be processed by one application
of $f_y$.

The role played by Fourier transforms in this algorithm is twofold. In
step 3 it is used to generate a superposition of all inputs from the ground
state $|0\rangle$. This could have been done by any unitary transformation having
an all-one vector in the first column. In step 6 we use the $\mathrm{QFT}_M$ to extract
the information about the period r which was hidden in the graph of the
function $f_y$.

Remark 4.5. It should be noted that for small numbers, as in the example of
Fig. 4.11 and 4.12, an optical setup using Fourier lenses (cf. Fig. 4.10) could
implement Shor’s algorithm. This very example was initially simulated with
the DigiOpt® system [223].



Fig. 4.11. Function graph of $f(x) = 2^x \bmod 187$




An important question is the behavior of the states obtained after step 6 if
the length M of the Fourier transform $\mathrm{QFT}_M$ we are using does not coincide
with the period r of the function which is transformed. Note that this period
r is exactly what has to be determined, and therefore the length M must
be chosen appropriately in order to gain enough information about r from
sampling. As it turns out, the choice $M = 2^m$ according to the condition $N^2 <
M < 2N^2$ yields peaks in the Fourier domain which are sharply concentrated
around the values lM/r [9].

In Fig. 4.11 and 4.12, this effect is illustrated for N = 187 and M = 1024.
This choice of M does not fulfill the condition $N^2 < M < 2N^2$. Nevertheless,
the characteristic peaks in the Fourier spectrum, which are sharply concentrated
around multiples of the inverse of the order, are apparent.
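
The spectrum for N = 187 and M = 1024 can be recomputed directly from the state in step 6. The Python sketch below is an illustration under assumptions: the order r is found by classical brute force (standing in for the quantity the quantum algorithm is meant to reveal), the measured value in step 5 is taken to correspond to the offset `x0 = 0`, and the names `multiplicative_order` and `fourier_spectrum` are hypothetical.

```python
import cmath


def multiplicative_order(y, modulus):
    """Smallest r > 0 with y^r = 1 (mod modulus); requires gcd(y, modulus) = 1."""
    r, value = 1, y % modulus
    while value != 1:
        value = (value * y) % modulus
        r += 1
    return r


def fourier_spectrum(y, modulus, length, x0=0):
    """Probabilities of measuring l in step 7, given that step 5 left the
    superposition of all x = x0 + k*r with x < length (assumes x0 < r)."""
    r = multiplicative_order(y, modulus)
    xs = list(range(x0, length, r))
    s = len(xs)
    probs = []
    for l in range(length):
        amp = sum(cmath.exp(2j * cmath.pi * ((x * l) % length) / length)
                  for x in xs)
        probs.append(abs(amp) ** 2 / (s * length))
    return r, probs


if __name__ == "__main__":
    r, probs = fourier_spectrum(2, 187, 1024)
    peaks = sorted(range(1024), key=lambda l: -probs[l])[:8]
    print("order r =", r)
    print("strongest outcomes l:", sorted(peaks))
    print("M / r =", 1024 / r)
```

The strongest outcomes lie at or next to integer multiples of M/r = 25.6, which is the concentration of the peaks described above.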

Let us now consider this state, which has been oversampled by
transforming it with a Fourier transform of length M followed by a measurement,
a little more closely. Constructive interference in

\[ \sum_{l=0}^{M-1} \sum_{k=0}^{s-1} e^{2\pi i (x_0 + kr) l / M} |l\rangle |z_0\rangle \]

will occur for those basis states $|l\rangle$ for which lr is close to a multiple of M. For
such an l, the probability of measuring it can be bounded by

\[ \frac{1}{sM} \left| \sum_{k=0}^{s-1} e^{2\pi i k r l / M} \right|^2 \ge \frac{1}{sM} \left| \sum_{k=0}^{s-1} e^{\pi i k r / M} \right|^2 = \frac{1}{sM} \left| \frac{\sin[\pi r s/(2M)]}{\sin[\pi r/(2M)]} \right|^2 . \]

The fractions l/M that we obtain from sampling fulfill the condition

\[ \left| \frac{l}{M} - \frac{p}{r} \right| < \frac{1}{2M} , \]

for some integer p. Because of the choice $M > N^2$ we obtain the result that
l/M can be approximated efficiently by continued fractions, as described in
the following section. This classical postprocessing completes the description
of Shor’s algorithm for finding the order r of y. To factor N we compute the
greatest common divisors $\gcd(y^{r/2} + 1, N)$ and $\gcd(y^{r/2} - 1, N)$, obtaining nontrivial
factors of N if r is even and $y^{r/2} \not\equiv \pm 1 \bmod N$.
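
The final classical step, from the order r to factors of N, is short enough to write out. The Python sketch below is an illustration; the function name `factors_from_order` is hypothetical, and the order r is assumed to have been determined already (by the quantum algorithm or otherwise).

```python
from math import gcd


def factors_from_order(modulus, y, r):
    """Return (gcd(y^(r/2) + 1, N), gcd(y^(r/2) - 1, N)) when the order r of
    y is even and y^(r/2) is not congruent to +1 or -1 mod N; otherwise
    return None, in which case another y has to be chosen."""
    if r % 2 == 1:
        return None
    half_power = pow(y, r // 2, modulus)
    if half_power in (1, modulus - 1):
        return None
    return gcd(half_power + 1, modulus), gcd(half_power - 1, modulus)


if __name__ == "__main__":
    # Example of Fig. 4.11: y = 2 has order 40 modulo N = 187.
    print(factors_from_order(187, 2, 40))
```

For N = 187 and y = 2 the intermediate value is $2^{20} \bmod 187 = 67$, and the two greatest common divisors are 17 and 11.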

A similar method can be applied to the discrete logarithm problem [9].
We mention that both the factoring and the discrete logarithm problem can
be readily recognized as hidden-subgroup problems (see also Sect. 5): in the
case of factoring, this corresponds to the group U generated by y and we are
interested in the index $[\mathbb{Z} : U]$, which equals the multiplicative order of y. The
discrete logarithm problem for $GF(q)^\times$ can be considered a hidden-subgroup
problem for the group $G = \mathbb{Z} \times \mathbb{Z}$ and the function $f : G \to GF(q)^\times$ given
by $f(x, y) := \zeta^x \alpha^{-y}$, where $\zeta$ is the primitive element and $\alpha$ is the element
for which we want to compute the logarithm.

The main difference from the situation in Simon’s algorithm (see Sect.
4.3) is that in these cases we cannot apply the Fourier transform for the
parent group G, since we do not know its order a priori. Rather, we have to
compute larger Fourier transforms; preferably, the length is chosen to be a
power of 2, to oversample. We then obtain the information from the sampled
Fourier spectrum by classical postprocessing.

The basic features of the method of Fourier sampling which we invoked in
Shor’s factoring algorithm are also incorporated in Kitaev’s algorithm for the
abelian stabilizer problem [187]. At the very heart of Kitaev’s approach is a
method to measure the eigenvalues of a unitary operator U, supposing that
the corresponding eigenvectors can be prepared. This estimation procedure
becomes efficient if, besides U, the powers $U^{2^i}$, i = 0, 1, . . ., can also be
implemented efficiently [180, 187]. We must also mention that the method of
Diophantine approximation is crucial for the phase estimation.

Diophantine Approximation. In the following we briefly review some 
properties of the continued-fraction expansion of a real number. An impor- 
tant property a number can have is to be one of the convergents of such an 
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expansion. More specifically, the following theorem holds, which is important 
in the sampling part of Shor’s algorithm for recovering the exact eigenvalues 
of an operator from the data sampled. 

Theorem 4.13. For each a; S R and each fraction p/q fulfilling 



the following holds: p/q is a convergent of the continued-fraction expansion 
of the real number x. 



A proof of this theorem can be found in standard texts on elementary 
number theory (e.g. [224]). There exists a simple algorithm for producing an 
expansion of a rational number into a continued fraction: 

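Algorithm 4 Let $x \in \mathbb{Q}$. Define $a_0 := \lfloor x \rfloor$, $x_1 := 1/(x - a_0)$, $a_1 := \lfloor x_1 \rfloor$, $x_2 := 1/(x_1 - a_1)$
and so on, until we obtain $x_i = a_i$ for the first time. Then we
can write x as

$$x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_i}}}}$$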



We mention in addition that Algorithm 4 not only yields optimal approximations
(if applied to elements $x \in \mathbb{R}$) but is also very efficient, since in
principle just a Euclidean algorithm is performed.
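Algorithm 4 and Theorem 4.13 can be tried out on a small rational number. The following is a minimal sketch in Python; the function names `continued_fraction` and `convergents` are our own, and the convergents are computed with the usual two-term recurrence, which is an assumption about presentation rather than something taken from the text.

```python
from fractions import Fraction

def continued_fraction(x):
    """Partial quotients a_0, a_1, ... of a rational x, following Algorithm 4."""
    quotients = []
    while True:
        a = x.numerator // x.denominator  # floor of x
        quotients.append(a)
        if x == a:                        # x_i = a_i: the expansion terminates
            return quotients
        x = 1 / (x - a)

def convergents(quotients):
    """Convergents p/q of the expansion, via p_k = a_k p_{k-1} + p_{k-2} (same for q)."""
    p_prev, q_prev, p, q = 1, 0, quotients[0], 1
    result = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

x = Fraction(415, 93)
cf = continued_fraction(x)   # partial quotients of 415/93
conv = convergents(cf)       # the last convergent is x itself
```

For x = 415/93 the fraction 58/13 differs from x by 1/1209, which is below 1/(2 · 13^2), so Theorem 4.13 predicts that it appears among the convergents.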



4.5.3 Taxonomy of Quantum Algorithms 

The quantum algorithms which have been discovered by now fall into two 
categories, the principles of which we shall describe in the following. 

Entanglement-Driven Algorithms. Suppose we are given a function $f : X \to Y$
from a (finite) domain X to a codomain Y. This function does not
have to be injective; however, for a quantum computer to be able to perform
f with respect to a suitably encoded X and Y, the function f has to be
embedded into a unitary matrix $V_f$ (cf. Sect. 4.2.3). We can then compute
the images of all inputs $x \in X$ simultaneously, using $|x\rangle|0\rangle \mapsto |x\rangle|f(x)\rangle$ for all
$x \in X$, by first preparing an equal superposition $\sum_{x \in X} |x\rangle$ in the X register and
the ground state $|0\rangle$ in the Y register, and then applying the quantum
circuit $V_f$ to obtain $\sum_{x \in X} |x\rangle|f(x)\rangle$. This entangled state can then be written
as

$$\sum_{y \in \mathrm{Im}(f)} \Bigl( \sum_{x : f(x) = y} |x\rangle \Bigr) |y\rangle ,$$




i.e. we obtain a separation of the preimages of f. Measuring the second
register leaves us with one of these preimages.

Pars pro toto, we mention the algorithms of Shor, which have been
described in Sect. 4.5.2. In this case the function f is given by $f(x) = a^x \bmod N$,
where N is the number to be factored and a is a random element in $\mathbb{Z}_N^\times$.
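The separation of preimages can be made concrete classically for a toy instance. The sketch below groups the inputs of $f(x) = a^x \bmod N$ by their image; the values a = 2, N = 15 and the input range 0..15 are illustrative choices of ours, not taken from the text.

```python
from collections import defaultdict

def preimage_classes(a, modulus, num_inputs):
    """Group the inputs x = 0..num_inputs-1 by the value a**x mod modulus."""
    classes = defaultdict(list)
    for x in range(num_inputs):
        classes[pow(a, x, modulus)].append(x)
    return dict(classes)

# Each class is what remains in the first register after measuring the second.
classes = preimage_classes(2, 15, 16)
```

Every class is an arithmetic progression whose common difference is the multiplicative order of a modulo N, which is the periodicity that the Fourier sampling step is meant to reveal.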

Superposition-Driven Algorithms. The principle of this class of algo- 
rithms is to amplify to a certain extent the amplitudes of a set of “good” 
states, e.g. states which are specified by a predicate given in the form of a 
quantum circuit, and on the other hand to shrink the amplitudes of the “bad” 
states. 

Pars pro toto, we mention the Grover algorithm for searching an un- 
ordered list. We gave an outline of this algorithm in Sect. 4.5.1 and presented 
an optical implementation using Fourier lenses. 
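The amplification principle can be illustrated by a small classical simulation of the amplitude vector. This is a sketch under our own assumptions: real amplitudes, a predicate `is_good` given as a Python function, and the textbook form of the Grover step (sign flip of the good states followed by a reflection about the mean).

```python
import math

def grover_iteration(amplitudes, is_good):
    """One amplification step on a list of real amplitudes."""
    # Oracle: flip the sign of the amplitudes of the "good" states.
    flipped = [-a if is_good(i) else a for i, a in enumerate(amplitudes)]
    # Diffusion: reflect every amplitude about the mean amplitude.
    mean = sum(flipped) / len(flipped)
    return [2 * mean - a for a in flipped]

def good_probability(n_items, is_good, iterations):
    """Probability of measuring a good state after the given number of steps."""
    amplitudes = [1 / math.sqrt(n_items)] * n_items  # equal superposition
    for _ in range(iterations):
        amplitudes = grover_iteration(amplitudes, is_good)
    return sum(a * a for i, a in enumerate(amplitudes) if is_good(i))

p_small = good_probability(4, lambda i: i == 2, 1)
p_large = good_probability(64, lambda i: i == 5, 6)
```

With a single good state among 64, six steps (roughly (π/4)·√64) already concentrate almost all of the probability on it, while the bad amplitudes shrink accordingly.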



4.6 Quantum Signal Transforms 

Abelian Fourier transforms have been used extensively in the algorithms in 
the preceding sections. We now consider further classes of unitary transforma- 
tions which admit efficient realizations in a computational model of quantum 
circuits. In Sect. 4.6.1 we introduce generalized Fourier transforms which yield 
unitary transformations parametrized by (nonabelian) finite groups. Classi- 
cally, advanced methods have been developed for the study of fast Fourier 
transforms for solvable groups [171, 174, 225, 226]. While it is an open ques- 
tion whether efficient quantum Fourier transforms exist for all finite solvable 
groups, it is possible to give efficient circuits for special classes (see Sect. 
4.6.1). In Sect. 4.6.2 we consider a class of real orthogonal transformations 
useful in classical signal processing, for which quantum circuits of polyloga- 
rithmic size exist. 

4.6.1 Quantum Fourier Transforms: the General Case 

In Sect. 4.4 we have encountered the special case of the discrete Fourier trans- 
form for abelian groups. This concept can be generalized to arbitrary finite 
groups, which leads to an interesting and well-studied topic for classical com- 
puters. We refer to [171, 174, 225, 226] as representatives of a vast number of 
publications. The reader not familiar with the standard notations concerning 
group representations is referred to these publications and to standard ref- 
erences such as [213, 227]. Following [228], we briefly present the terms and 
notations from representation theory which we are going to use, and recall 
the definition of Fourier transforms. 
The Wedderburn Decomposition. Let $\phi$ be a regular representation of
the finite group G. Then a Fourier transform for G is any matrix A that
decomposes $\phi$ into irreducible representations with the additional property
that equivalent irreducibles in the corresponding decomposition are equal.
Regular representations are not unique, since they depend on the ordering
of the elements of G. Also, note that this definition says nothing about the
choice of the irreducible representations of G, which in the nonabelian case
are not unique.

On the level of algebras, the matrix A is a constructive realization of the 
algebra isomorphism 

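$$\mathbb{C}G \cong \bigoplus_i M_{d_i}(\mathbb{C}) \qquad (4.7)$$

of $\mathbb{C}$-algebras, where $M_{d_i}(\mathbb{C})$ denotes the full matrix ring of $d_i \times d_i$ matrices
with coefficients in $\mathbb{C}$. This decomposition is also known as the Wedderburn
decomposition of the group algebra $\mathbb{C}G$.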

We remind the reader of the fact that the decomposition in (4.7) is quite 
familiar from the theory of error-avoiding quantum codes and noiseless sub- 
systems [229, 230, 231, 232]. 

As an example of a Fourier transform, let $G = \mathbb{Z}_n = \langle x \mid x^n = 1 \rangle$
be the cyclic group of order n with regular representation $\phi = 1_E \uparrow_T G$,
$T = (x^0, \ldots, x^{n-1})$, and let $\omega_n$ be a primitive nth root of unity.† Now
$\phi^A = \bigoplus_{i=0}^{n-1} \rho_i$, where $\rho_i : x \mapsto \omega_n^i$ and $A = \mathrm{DFT}_n = (1/\sqrt{n})[\omega_n^{ij} \mid i, j = 0 \ldots n-1]$
is the (unitary) discrete Fourier transform well known from signal
processing.
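The cyclic example can be checked numerically. The sketch below assumes that the regular representation sends the generator x to the cyclic shift $e_j \mapsto e_{j+1 \bmod n}$ and that conjugation means $\phi^A = A^{-1} \phi A$; both are our conventions for illustration. With these choices the diagonal comes out as $\omega_n^{-i}$, i.e. the same set of irreducibles in a different order; whether one sees $\omega_n^{i}$ or $\omega_n^{-i}$ depends on such conventions.

```python
import cmath
import math

def dft_matrix(n):
    """Unitary DFT_n = (1/sqrt(n)) [w^(ij)] with w = exp(2 pi i / n)."""
    w = cmath.exp(2j * math.pi / n)
    return [[w ** (i * j) / math.sqrt(n) for j in range(n)] for i in range(n)]

def shift_matrix(n):
    """Image of the generator x: basis vector j is sent to j + 1 mod n."""
    return [[1 if i == (j + 1) % n else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose; equals the inverse for a unitary matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def decomposed_generator(n):
    """A^{-1} * phi(x) * A, which should be diagonal."""
    a = dft_matrix(n)
    return matmul(dagger(a), matmul(shift_matrix(n), a))

d = decomposed_generator(5)
```

Since every group element is a power of x, a diagonal image of the generator means that the whole regular representation has been decomposed into one-dimensional irreducibles.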

If A is a Fourier transform for the group G, then any fast algorithm 
for the multiplication with A is called a fast Fourier transform for G. Of 
course, the term fast depends on the complexity model chosen. Since we 
are primarily interested in the realization of a fast Fourier transform on a 
quantum computer (QFT), we first have to use the complexity measure k, as 
derived in Sect. 4.2.1. 

Classically, a fast Fourier transform is given by a factorization of the
decomposition matrix A into a product of sparse matrices‡ [171, 174, 225,
233]. For a solvable group G, this factorization can be obtained recursively
using the following idea. First, a normal subgroup N of prime index (G : N) = p
is chosen. Using transitivity of induction, $\phi = 1_E \uparrow G$ is written as $(1_E \uparrow N) \uparrow G$
(note that we have the freedom to choose the transversals appropriately).
Then $1_E \uparrow N$, which again is a regular representation, is decomposed (by
recursion), yielding a Fourier transform B for N. In the last step, A is derived
from B using a recursion formula.

† The induction of a representation $\phi$ of a subgroup $H \le G$ with transversal
$T = (t_1, \ldots, t_k)$ is defined by $(\phi \uparrow_T G)(g) := [\dot\phi(t_i g t_j^{-1}) \mid i, j = 1 \ldots k]$, where
$\dot\phi(x) := \phi(x)$ for $x \in H$ and otherwise is the zero matrix of the appropriate size.

‡ Note that in general, sparseness of a matrix does not imply low computational
complexity with respect to the complexity measure k.

A Decomposition Algorithm. Following [228], we explain this procedure
in more detail by first presenting two essential theorems (without proof)
and then stating the actual algorithm for deriving fast Fourier transforms
for solvable groups. The special tensor structure of the recursion formula
mentioned above will allow us to use this algorithm as a starting point to
obtain fast quantum Fourier transforms in the case where G is a 2-group
(i.e. |G| is a power of 2).

First we need Clifford's theorem, which explains the relationship between
the irreducible representations of G and those of a normal subgroup N of G.
Recall that G acts on the representations of N via inner conjugation: given
a representation $\rho$ of N and $t \in G$ we define $\rho^t : n \mapsto \rho(t n t^{-1})$ for $n \in N$.

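Theorem 4.14 (Clifford's Theorem). Let $N \trianglelefteq G$ be a normal subgroup
of prime index p with (cyclic) transversal $T = (t^0, t^1, \ldots, t^{p-1})$, and denote
by $\lambda_i : t \mapsto \omega_p^i$, $i = 0, \ldots, p-1$, the p irreducible representations of G arising
from G/N. Assume $\rho$ is an irreducible representation of N. Then exactly one
of the two following cases applies:

1. $\rho \cong \rho^t$ and $\rho$ has p pairwise inequivalent extensions to G. If $\bar\rho$ is one of
them, then all are given by $\lambda_i \cdot \bar\rho$, $i = 0, \ldots, p-1$.

2. $\rho \not\cong \rho^t$ and $\rho \uparrow_T G$ is irreducible. Furthermore,

$$(\rho \uparrow_T G) \downarrow N = \bigoplus_{i=0}^{p-1} \rho^{t^i}$$

and

$$\bigl(\lambda_i \cdot (\rho \uparrow_T G)\bigr)^{D \otimes 1_d} = \rho \uparrow_T G , \qquad D = \mathrm{diag}(1, \omega_p, \ldots, \omega_p^{p-1})^i .$$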

The following theorem provides the recursion formula and was used earlier 
by Beth [171] to obtain fast Fourier transforms based on the tensor product 
as a parallel-processing model. 

Theorem 4.15. Let $N \trianglelefteq G$ be a normal subgroup of prime index p having
a transversal $T = (t^0, t^1, \ldots, t^{p-1})$, and let $\phi$ be a representation of degree
d of N. Suppose that A is a matrix decomposing $\phi$ into irreducibles, i.e.
$\phi^A = \rho = \rho_1 \oplus \ldots \oplus \rho_k$, and that $\bar\rho$ is an extension of $\rho$ to G. Then

$$(\phi \uparrow_T G)^B = \bigoplus_{i=0}^{p-1} \lambda_i \cdot \bar\rho ,$$

where $\lambda_i : t \mapsto \omega_p^i$, $i = 0, \ldots, p-1$, are the p irreducible representations of
G arising from the factor group G/N,

$$B = (1_p \otimes A) \cdot D \cdot (\mathrm{DFT}_p \otimes 1_d) , \quad \text{and} \quad D = \bigoplus_{i=0}^{p-1} \bar\rho(t)^i .$$

If, in particular, $\bar\rho$ is a direct sum of irreducibles, then B is a decomposition
matrix of $\phi \uparrow_T G$.





Thomas Beth and Martin Rotteler 



In the case of a cyclic group G the formula yields exactly the well-known
Cooley-Tukey decomposition (see also Sect. 4.4.1 and [209]), in which D is
usually called the twiddle matrix.
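The shape of this factorization can be verified for the smallest nontrivial cyclic case, $\mathbb{Z}_4$ with subgroup $\mathbb{Z}_2$. The sketch below uses one common textbook convention for the Cooley-Tukey step (twiddle matrix diag(1, 1, 1, $\omega_4$) and a stride permutation L exchanging the indices 1 and 2); the ordering of rows and columns in the book's formula may differ, so the permutation is an assumption of ours.

```python
import cmath
import math

def dft(n):
    """Unitary DFT_n with w = exp(2 pi i / n)."""
    w = cmath.exp(2j * math.pi / n)
    return [[w ** (i * j) / math.sqrt(n) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def kron(a, b):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

w4 = cmath.exp(2j * math.pi / 4)
twiddle = [[0] * 4 for _ in range(4)]
for i, entry in enumerate([1, 1, 1, w4]):
    twiddle[i][i] = entry                                          # the matrix D
stride = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]  # swaps indices 1 and 2

# (1_2 (x) DFT_2) * D * (DFT_2 (x) 1_2), followed by the reordering L
factored = matmul(stride,
                  matmul(kron(identity(2), dft(2)),
                         matmul(twiddle, kron(dft(2), identity(2)))))
```

The product has the form $(1_p \otimes A) \cdot D \cdot (\mathrm{DFT}_p \otimes 1_d)$ with p = d = 2 and A = DFT_2, up to the final reordering of the irreducibles.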

Assume that $N \trianglelefteq G$ is a normal subgroup of prime index p with Fourier
transform A and decomposition $\phi^A = \rho = \bigoplus_i \rho_i$. We can reorder the $\rho_i$
such that the first, say k, $\rho_i$ have an extension $\bar\rho_i$ to G and the other $\rho_i$ occur
as sequences $\rho_i \oplus \rho_i^t \oplus \cdots \oplus \rho_i^{t^{p-1}}$ of inner conjugates (cf. Theorem 4.14; note
that the irreducibles $\rho_i$, $\rho_i^{t^j}$ have the same multiplicity since $\phi$ is regular). In
the first case the extension may be calculated by Minkwitz's formula [234];
in the latter case each sequence can be extended by $\rho_i \uparrow_T G$ (Theorem 4.14,
case 2). We do not state Minkwitz's formula here, since we shall not need it
in the special cases treated later on. Altogether, we obtain an extension $\bar\rho$ of $\rho$
and can apply Theorem 4.15. The remaining task is to ensure that equivalent
irreducibles in $\bigoplus_{i=0}^{p-1} \lambda_i \cdot \bar\rho$ are equal. For summands of $\bar\rho$ of the form $\bar\rho_i$ we
have the result that $\lambda_j \cdot \bar\rho_i$ and $\bar\rho_i$ are inequivalent, and hence there is nothing
to do. For summands of $\bar\rho$ of the form $\rho_i \uparrow_T G$, we conjugate $\lambda_j \cdot (\rho_i \uparrow_T G)$
onto $\rho_i \uparrow_T G$ using Theorem 4.14, case 2.

Now we are ready to formulate the recursive algorithm for constructing a
fast Fourier transform for a solvable group G due to Püschel et al. [228].

Algorithm 5 Let $N \trianglelefteq G$ be a normal subgroup of prime index p with transversal
$T = (t^0, t^1, \ldots, t^{p-1})$. Suppose that $\phi$ is a regular representation of N
with (fast) Fourier transform A, i.e. $\phi^A = \rho_1 \oplus \ldots \oplus \rho_k$, fulfilling $\rho_i \cong \rho_j \Rightarrow \rho_i = \rho_j$.
A Fourier transform B of G with respect to the regular representation
$\phi \uparrow_T G$ can be obtained as follows.

1. Determine a permutation matrix P that rearranges the $\rho_i$, $i = 1, \ldots, k$,
such that the extensible $\rho_i$ (i.e. those satisfying $\rho_i = \rho_i^t$) come first,
followed by the other representations ordered into sequences of length p
equivalent to $\rho_i, \rho_i^t, \ldots, \rho_i^{t^{p-1}}$. (Note that these sequences need to be equal
to $\rho_i, \rho_i^t, \ldots, \rho_i^{t^{p-1}}$, which is established in the next step.)

2. Calculate a matrix M which is the identity on the extensibles and conjugates
the sequences of length p to make them equal to $\rho_i, \rho_i^t, \ldots, \rho_i^{t^{p-1}}$.

3. Note that $A \cdot P \cdot M$ is a decomposition matrix for $\phi$, too, and let
$\rho = \phi^{A \cdot P \cdot M}$. Extend $\rho$ to G summand-wise. For the extensible summands use
Minkwitz's formula; the sequences $\rho_i, \rho_i^t, \ldots, \rho_i^{t^{p-1}}$ can be extended by
$\rho_i \uparrow_T G$.

4. Evaluate $\bar\rho$ at t and build $D = \bigoplus_{i=0}^{p-1} \bar\rho(t)^i$.

5. Construct a block-diagonal matrix C with Theorem 4.14, case 2, conjugating
$\bigoplus_{i=0}^{p-1} \lambda_i \cdot \bar\rho$ such that equivalent irreducibles are equal. C is the
identity on the extended summands.

Result:

$$B = (1_p \otimes A \cdot P \cdot M) \cdot D \cdot (\mathrm{DFT}_p \otimes 1_{|N|}) \cdot C \qquad (4.8)$$

is a fast Fourier transform for G.
Fig. 4.13. Coarse quantum circuit visualizing Algorithm 5 for 2-groups



It is obviously possible to construct fast Fourier transforms on a classical 
computer for any solvable group by recursive use of this algorithm. 

Since we are restricting ourselves to the case of a quantum computer
consisting of qubits, i.e. two-level systems, we apply Algorithm 5 to obtain
QFTs for 2-groups (i.e. $|G| = 2^n$ and thus p = 2). In this case the two tensor
products occurring in (4.8) fit very well and yield a coarse factorization, as
shown in Fig. 4.13. The lines in the figure correspond to the qubits as in Sect.
4.2.1, and a box ranging over more than one line denotes a matrix admitting
no a priori factorization into a tensor product.

The remaining problem is the realization of the matrices A, P, M, D, C in 
terms of elementary building blocks as presented in Sect. 4.2.1. At present, 
the realization of these matrices remains a creative process to be performed 
for given (classes of) finite groups. 

In [228] Algorithm 5 is applied to a class of nonabelian 2-groups, namely
the 2-groups which contain a cyclic normal subgroup of index 2. These have
been classified (see [235], Sect. 14.9, pp. 90-91), and for $n \ge 3$ there are
exactly the following four isomorphism types:

1. the dihedral group $D_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^2 = 1,\ x^y = x^{-1} \rangle$

2. the quaternion group $Q_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^4 = 1,\ x^y = x^{-1} \rangle$

3. the group $QP_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^2 = 1,\ x^y = x^{2^{n-1}+1} \rangle$

4. the quasi-dihedral group $QD_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^2 = 1,\ x^y = x^{2^{n-1}-1} \rangle$.

Observe that the extensions 1, 3 and 4 of the cyclic subgroup $\mathbb{Z}_{2^n} = \langle x \rangle$ split,
i.e. the groups have the structure of a semidirect product of $\mathbb{Z}_{2^n}$ with $\mathbb{Z}_2$.
The three isomorphism types correspond to the three different embeddings
of $\mathbb{Z}_2 = \langle y \rangle$ into $(\mathbb{Z}_{2^n})^\times \cong \mathbb{Z}_2 \times \mathbb{Z}_{2^{n-2}}$.
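The three embeddings correspond to the three elements of order 2 in the unit group of $\mathbb{Z}_{2^n}$, which is easy to confirm by brute force. The helper name `involutions` below is ours.

```python
def involutions(n):
    """Elements u != 1 of the unit group of Z_{2^n} with u*u = 1."""
    modulus = 2 ** n
    return [u for u in range(2, modulus) if (u * u) % modulus == 1]

# For n >= 3 these are 2^(n-1) - 1, 2^(n-1) + 1 and 2^n - 1 (i.e. -1),
# matching the exponents in the quasi-dihedral, QP and dihedral relations.
exponents = involutions(4)
```

For n = 4 the three values are 7, 9 and 15, so conjugation by y acts as $x \mapsto x^{7}$, $x \mapsto x^{9}$ or $x \mapsto x^{-1}$ in the three split cases.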

In [228] quantum circuits with polylogarithmic gate complexity are given 
for the Fourier transforms for each of these groups. See also [236, 237] for 
quantum Fourier transforms for nonabelian groups. 



An Example: Wreath Products. In this section we recall the definition
of wreath products in general (see also [235, 238]) and, as an example, give
efficient quantum Fourier transforms for a certain family of wreath products.

Definition 4.4. Let G be a group and $H \subseteq S_n$ be a subgroup of the symmetric
group on n letters. The wreath product $G \wr H$ of G with H is the set

$$\{ (\varphi, h) : h \in H,\ \varphi : [1, \ldots, n] \to G \}$$

equipped with the multiplication

$$(\varphi_1, h_1) \cdot (\varphi_2, h_2) := (\psi, h_1 h_2) ,$$

where $\psi$ is the mapping given by $i \mapsto \varphi_1(i^{h_2}) \varphi_2(i)$ for $i \in [1, \ldots, n]$.

In other words, the wreath product is isomorphic to a semidirect product
of the so-called base group $N := G \times \ldots \times G$, which is the n-fold direct product
of (independent) copies of G, with H, in symbols $G \wr H = N \rtimes H$, where H
operates via permutation of the direct factors of N. So we can think of the
elements as n-tuples of elements from G together with a permutation $\tau$, and
multiplication is done componentwise after a suitable permutation of the first
n factors:

$$(g_1, \ldots, g_n; \tau) \cdot (g'_1, \ldots, g'_n; \tau') = (g_{\tau'(1)} g'_1, \ldots, g_{\tau'(n)} g'_n; \tau \tau') .$$
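The multiplication rule can be exercised on the smallest example $\mathbb{Z}_2 \wr \mathbb{Z}_2$. In the sketch below an element is a pair `((g1, g2), t)` with entries in {0, 1}, the group $\mathbb{Z}_2$ is written additively, and t = 1 stands for the transposition of the two positions; this encoding is our own choice for illustration.

```python
from itertools import product

def multiply(a, b):
    """(g; t) * (h; s) = (g permuted by s, added componentwise to h; t s)."""
    (g, t), (h, s) = a, b
    permuted = g if s == 0 else (g[1], g[0])
    return (tuple((x + y) % 2 for x, y in zip(permuted, h)), t ^ s)

elements = [((g1, g2), t) for g1, g2, t in product((0, 1), repeat=3)]

is_associative = all(
    multiply(multiply(a, b), c) == multiply(a, multiply(b, c))
    for a in elements for b in elements for c in elements
)
is_abelian = all(multiply(a, b) == multiply(b, a) for a in elements for b in elements)
```

The eight elements form a nonabelian group, so even this smallest wreath product lies outside the abelian setting of Sect. 4.4.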

In this section we show how to compute a Fourier transform for certain
wreath products on a quantum computer. We show how the general recursive
method to obtain fast Fourier transforms on a quantum computer described
in [228] can be applied directly in the case of wreath products $A \wr \mathbb{Z}_2$, where
A is an arbitrary abelian 2-group. The recursion of the algorithm follows the
chain

$$A \wr \mathbb{Z}_2 \trianglerighteq A \times A \trianglerighteq E ,$$

where the second composition factor is the base group. We first want to
determine the irreducible representations of $G := A \wr \mathbb{Z}_2$. Let G* be the
base group of G, i.e. $G^* = A \times A$. G* is a normal subgroup of G of index
2. Denoting by $\mathcal{A} = \{\chi_1, \ldots, \chi_k\}$ the set of irreducible representations of A,
recall that the irreducible representations of G* are given by the set
$\{\chi_i \otimes \chi_j : i, j = 1, \ldots, k\}$ of pairwise tensor products (e.g. Sect. 5.6 of [212]).

Since $G^* \trianglelefteq G$, the group G operates on the representations of G* via inner
conjugation. Because G is a semidirect product of G* with $\mathbb{Z}_2$, we can write
each element $g \in G$ as $g = (a_1, a_2; \tau)$ with $a_1, a_2 \in A$ and we conclude

$$(\chi_1 \otimes \chi_2)^g = (\chi_1^{a_1} \otimes \chi_2^{a_2})^\tau = (\chi_1 \otimes \chi_2)^\tau ,$$

i.e. only the factor group $G/G^* = \mathbb{Z}_2$ operates via permutation of the tensor
factors. The operation of $\tau$ is to map $\chi_1 \otimes \chi_2 \mapsto \chi_2 \otimes \chi_1$.

Therefore, it is easy to determine the inertia groups (see [171, 235] for
definitions) $T_\rho$ of a representation $\rho$ of G*. We have to consider two cases:

(a) $\rho = \chi_i \otimes \chi_i$. Then $T_\rho = G$, since permutation of the factors leaves $\rho$
invariant.

(b) $\rho = \chi_i \otimes \chi_j$, $i \ne j$. Here we have $T_\rho = G^*$.

The irreducible representations of G* fulfilling (a) extend to representations
of G, whereas the induction of a representation fulfilling (b) is irreducible.
In this case the restriction of the induced representation to G* is, by
Clifford theory, equal to the direct sum $(\chi_i \otimes \chi_j) \oplus (\chi_j \otimes \chi_i)$.
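The case distinction fixes the degrees of all irreducible representations of $A \wr \mathbb{Z}_2$, which allows a quick consistency check against the group order. The sketch assumes that A is abelian with k characters (so |A| = k) and, following Theorem 4.14, that each representation of type (a) has two extensions while each unordered pair of type (b) contributes one induced representation of degree 2; the helper name is ours.

```python
def wreath_irrep_degrees(k):
    """Degrees of the irreducibles of A wr Z_2 for an abelian group A of order k."""
    extended = [1] * (2 * k)             # case (a): two extensions per chi_i (x) chi_i
    induced = [2] * (k * (k - 1) // 2)   # case (b): one induction per pair i != j
    return extended + induced

degrees = wreath_irrep_degrees(4)
```

The squares of the degrees must add up to $|G| = 2k^2$, in agreement with the Wedderburn decomposition (4.7).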

Example We consider the special case $W_n := \mathbb{Z}_2^n \wr \mathbb{Z}_2$, for which the
quantum Fourier transforms have an especially appealing form.

Applying the design principles for Fourier transforms described in this
section (see also [171, 174, 226, 228, 249]), we obtain the circuits for $\mathrm{DFT}_{W_n}$
in a straightforward way. Once we have studied the extension/induction behavior
of the irreducible representations of G*, the recursive formula

$$(1_2 \otimes \mathrm{DFT}_{G^*}) \cdot \bigoplus_{t \in T} \Phi(t) \cdot (\mathrm{DFT}_{\mathbb{Z}_2} \otimes 1_{|A|^2}) \qquad (4.9)$$

provides a Fourier transform for G. Here $\Phi(t)$ denotes the extension (as a
whole) of the regular representation of G* to a representation of G [171, 226,
228]. In the case of $W_n$, the transform $\mathrm{DFT}_{G^*}$ is the Fourier transform for
$\mathbb{Z}_2^{2n}$ and therefore a tensor product of 2n Hadamard matrices.




Fig. 4.14. The Fourier transform for the wreath product $\mathbb{Z}_2^n \wr \mathbb{Z}_2$



The circuits for the case of $W_n$ are shown in Fig. 4.14. Obviously, the
complexity cost of this circuit is linear in the number of qubits, since the
conditional gate representing the evaluation at the transversal $\bigoplus_{t \in T} \Phi(t)$
can be realized with 3n Toffoli gates.
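One way to arrive at the count 3n is to read the conditional gate as a controlled exchange of two n-qubit registers, with each controlled swap of a qubit pair built from three Toffoli gates. That reading is an assumption on our part and is not spelled out in the text; the sketch below only checks the three-Toffoli construction on classical bit strings.

```python
from itertools import product

def toffoli(bits, c1, c2, target):
    """Flip bits[target] if both control bits are set."""
    out = list(bits)
    if out[c1] and out[c2]:
        out[target] ^= 1
    return tuple(out)

def controlled_swap(bits, control, a, b):
    """Swap positions a and b when the control is set, using three Toffoli gates."""
    for c2, target in ((a, b), (b, a), (a, b)):
        bits = toffoli(bits, control, c2, target)
    return bits

results = {bits: controlled_swap(bits, 0, 1, 2) for bits in product((0, 1), repeat=3)}
```

Applying this to n qubit pairs with a common control uses 3n Toffoli gates in total.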

Nonabelian Hidden Subgroups. We adopt the definition of the hidden- 
subgroup problem given in [239] . The history of the hidden subgroup problem 
parallels the history of quantum computing, since the algorithms of Simon [-57] 
and Shor [9] can be formulated in the language of hidden subgroups (see, e.g., 
[240] for this reduction) for certain abelian groups. In [206] exact quantum 
algorithms (with a running time polynomial in the number of evaluations of
the given black-box function and the classical postprocessing) are given for 
the hidden-subgroup problem in the abelian case. For the general case we use 
the following definition, which should be compared with Simon’s problem in 
Sect. 4.3. 

Definition 4.5 (The Hidden Subgroup Problem). Let G be a finite
group and $f : G \to R$ a mapping from G to an arbitrary domain R fulfilling
the following conditions:

(a) The function f is given as a quantum circuit, i.e. f can be evaluated in
superpositions.

(b) There exists a subgroup $U \subseteq G$ such that f takes a constant value on
each of the cosets gU for $g \in G$.

(c) Furthermore, f takes different values on different cosets.

The problem is to find generators for U.
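Conditions (b) and (c) can be checked by brute force for small classical examples. The sketch below does this for the additive group $\mathbb{Z}_{12}$; the function name `hides_subgroup` and the example functions are ours, and condition (a) is of course not captured by a classical check.

```python
def hides_subgroup(f, elements, op, subgroup):
    """True if f is constant on each coset gU and distinct on different cosets."""
    cosets = {frozenset(op(g, u) for u in subgroup) for g in elements}
    values = []
    for coset in cosets:
        images = {f(g) for g in coset}
        if len(images) != 1:                 # condition (b) fails
            return False
        values.append(images.pop())
    return len(set(values)) == len(cosets)   # condition (c)

group = list(range(12))
add_mod_12 = lambda g, u: (g + u) % 12
subgroup = [0, 4, 8]

hidden_ok = hides_subgroup(lambda g: g % 4, group, add_mod_12, subgroup)
too_coarse = hides_subgroup(lambda g: g % 2, group, add_mod_12, subgroup)
not_constant = hides_subgroup(lambda g: g % 3, group, add_mod_12, subgroup)
```

Here g mod 4 hides U = {0, 4, 8}; g mod 2 is constant on the cosets but repeats values across cosets, and g mod 3 is not constant on the cosets at all.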

We have already seen that Simon’s algorithm and Shor’s algorithms for the 
discrete logarithm and factoring can be seen as instances of abelian hidden- 
subgroup problems. 

The hidden-subgroup problem for nonabelian groups provides a natural
generalization of these quantum algorithms. Interesting problems can be
formulated as hidden-subgroup problems for nonabelian groups, e.g. the graph
isomorphism problem, which is equivalent to the problem of deciding whether
a graph has a nontrivial automorphism group [241]. In this case G is the
symmetric group $S_n$ acting on a given graph $\Gamma$ with n vertices. To reduce the
graph automorphism problem to a hidden-subgroup problem for $S_n$, we
encode $\Gamma$ into a binary string in $R := \{0, 1\}^*$ and define f to be the mapping
which assigns to a given permutation $\sigma \in S_n$ the graph $\Gamma^\sigma$. Progress in the
direction of the graph isomorphism problem has been made in [242], but an
efficient quantum algorithm solving the hidden-subgroup problem for $S_n$ or
the graph isomorphism problem is still not known.

In [239], the case of hidden subgroups of dihedral groups is addressed. 
The authors show that it is possible to solve the hidden-subgroup problem 
using only polynomially many queries to the black-box function /. However, 
the classical postprocessing takes exponential time in order to solve a
nonlinear optimization problem. In [243] the hidden-subgroup problem for the
wreath products $\mathbb{Z}_2^n \wr \mathbb{Z}_2$ is solved on a quantum computer, using
polynomially many queries to f and efficient classical postprocessing which takes
O(n^) steps. The fast quantum Fourier transforms for these groups, which 
have been derived in Sect. 5, have been used in this solution. 

4.6.2 The Discrete Cosine Transform 

In this section we address the problem of computing further signal transforms
on a quantum computer. More specifically, we give a realization of the discrete
cosine transforms (DCTs) of type II on a quantum computer, which is based
on a well-known reduction to the computation of a discrete Fourier transform
(DFT) of double length.

The DCT has numerous applications in classical signal processing, and 
hence, this transform might be useful in quantum computing and quantum 
state engineering as well. For efficient quantum signal transforms, and espe- 
cially for wavelet transforms on quantum computers, see also [236, 244, 245, 
246]. 

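The discrete cosine transforms come in different flavors, varying slightly
in their definitions. Here we restrict ourselves to the discrete cosine
transform of type II ($\mathrm{DCT}_{II}$) as defined below.

Recall that the $\mathrm{DCT}_{II}$ is the N x N matrix defined by (see [247], p. 11)

$$\mathrm{DCT}_{II} := \Bigl(\frac{2}{N}\Bigr)^{1/2} \Bigl[ k_i \cos \frac{i (j + 1/2) \pi}{N} \Bigr]_{i, j = 0, \ldots, N-1} ,$$

where $k_i = 1$ for $i = 1, \ldots, N-1$ and $k_0 = 1/\sqrt{2}$.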

In [247] the $\mathrm{DCT}_{II}$ is shown to be a real and orthogonal (and hence unitary)
transformation. Closely related is the discrete cosine transform $\mathrm{DCT}_{III}$,
which is defined as the transposed matrix (and hence the inverse) of $\mathrm{DCT}_{II}$.
Therefore, each efficient quantum circuit for the $\mathrm{DCT}_{III}$ yields one for the
$\mathrm{DCT}_{II}$ and vice versa, since the inverse transform is obtained by reading the
circuit backwards (where each elementary gate is conjugated and transposed).
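The orthogonality claim is easy to confirm numerically. The sketch below assumes the standard orthonormal definition of the type-II transform, i.e. entries $\sqrt{2/N}\, k_i \cos[i(j + 1/2)\pi/N]$ with $k_0 = 1/\sqrt{2}$ and $k_i = 1$ otherwise; the function names are ours.

```python
import math

def dct2_matrix(n_points):
    """DCT of type II as a list of rows, in its orthonormal normalization."""
    rows = []
    for i in range(n_points):
        k = 1 / math.sqrt(2) if i == 0 else 1.0
        rows.append([math.sqrt(2 / n_points) * k *
                     math.cos(i * (j + 0.5) * math.pi / n_points)
                     for j in range(n_points)])
    return rows

def gram(matrix):
    """matrix * matrix^T, which is the identity for an orthogonal matrix."""
    return [[sum(a * b for a, b in zip(r, s)) for s in matrix] for r in matrix]

m = dct2_matrix(8)
g = gram(m)
```

Since the matrix is orthogonal, its transpose (the type-III transform) is its inverse.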

Concerning the applications in classical signal processing, we should mention
the well-known fact that the DCTs (of all families) are asymptotically
equivalent to the Karhunen-Loeve transform for signals produced by a
first-order Markov process. The $\mathrm{DCT}_{II}$ is also used in the JPEG image
compression standard [247].

For a given (normalized) vector $|x\rangle = (x(0), \ldots, x(N-1))^t$, we want to
compute the matrix vector product $\mathrm{DCT}_{II} \cdot |x\rangle$ on a quantum computer
efficiently. Since $\mathrm{DCT}_{II}$ is a unitary matrix, it has a factorization into elementary
quantum gates, and we seek a factorization of polylogarithmic length. Note
that we restrict ourselves to the case $N = 2^n$ since matrices of this size fit
naturally into the tensor product structure of the Hilbert space imposed by
the qubits.

Theorem 4.16. The discrete cosine transform $\mathrm{DCT}_{II}(2^n)$ of length $2^n$ can
be computed in $O(n^2)$ steps using one auxiliary qubit.

Proof. The main idea (following Chap. 4 of [247]) is to reduce the computation
of $\mathrm{DCT}_{II}$ to a computation of a DFT of double length.

Instead of the input vector $|x\rangle$ of length $2^n$, we consider $|y\rangle$ of length
$2^{n+1}$, defined by

$$y(i) := \begin{cases} x(i)/\sqrt{2} , & i = 0, \ldots, 2^n - 1 \\ x(2^{n+1} - i - 1)/\sqrt{2} , & i = 2^n, \ldots, 2^{n+1} - 1 . \end{cases}$$

The input state of the quantum computer considered here, which has a
register of length n carrying $|x\rangle$ and an extra qubit (which is initialized in
the ground state), is $|\varphi_{in}\rangle = |0\rangle|x\rangle$ (written explicitly, this is
$x(0)|00 \ldots 0\rangle + \cdots + x(2^n - 1)|01 \ldots 1\rangle$), which is transformed by the circuit given in Fig. 4.15
to yield $|y\rangle$.



[Circuit diagram with inputs $|0\rangle$ and $|x\rangle$ and gates $H_2$ and $\pi_{rev}$]

Fig. 4.15. Circuit that prepares the vector $|y\rangle$



As usual, $H_2$ is the Hadamard transformation and $\pi_{rev}$ is the permutation
obtained by performing a $\sigma_x$ on each wire.
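The effect of this preparation step can be simulated on a plain amplitude list. The sketch assumes that the auxiliary qubit is the most significant bit of the index and that $\pi_{rev}$ is applied conditioned on the auxiliary qubit being 1, which is what the definition of y requires; the function name is ours.

```python
import math

def prepare_y(x):
    """Map |0>|x> to |y>: Hadamard on the auxiliary qubit, then a conditional bit flip."""
    size = len(x)                        # size = 2**n
    state = list(x) + [0.0] * size       # amplitudes of |0>|x>
    # Hadamard on the auxiliary (most significant) qubit
    lower = [(state[j] + state[size + j]) / math.sqrt(2) for j in range(size)]
    upper = [(state[j] - state[size + j]) / math.sqrt(2) for j in range(size)]
    # Flipping every register bit sends j to size - 1 - j; done only in the upper half
    return lower + upper[::-1]

x = [0.1, 0.2, 0.3, 0.4]
y = prepare_y(x)
```

The upper half of y is the mirror image of the lower half, i.e. y(i) = y(2^{n+1} - 1 - i), which is the symmetric extension needed for the cosine transform.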

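Application of $\mathrm{DFT}_{2^{n+1}} := (1/\sqrt{2^{n+1}})[\omega_{2^{n+1}}^{ij}]_{i,j = 0, \ldots, 2^{n+1}-1}$, where $\omega_{2^{n+1}}$
denotes a primitive $2^{n+1}$th root of unity, to the vector $|y\rangle$ yields the components

$$(\mathrm{DFT}_{2^{n+1}} \cdot y)(m) = \frac{1}{\sqrt{2^{n+2}}} \sum_{i=0}^{2^n - 1} \bigl[ \omega_{2^{n+1}}^{mi} + \omega_{2^{n+1}}^{m(2^{n+1} - i - 1)} \bigr] x(i) ,$$

which holds for $m = 0, \ldots, 2^{n+1} - 1$.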

Note that multiplication with $\mathrm{DFT}_{2^{n+1}}$ can be performed in $O(n^2)$ steps
on a quantum computer [9, 248], which is quite contrary to the classical FFT
algorithm, which requires $O(n 2^n)$ arithmetic operations. However, the tensor
product recursion formula [171, 249] is well suited for direct translation into
efficient quantum circuits [203].

Multiplication of the vector component y(m) by the phase factor $\omega_{2^{n+1}}^{m/2}$
yields

$$\omega_{2^{n+1}}^{m/2} \bigl( \omega_{2^{n+1}}^{mi} + \omega_{2^{n+1}}^{m(2^{n+1} - i - 1)} \bigr) = \omega_{2^{n+1}}^{m(i + 1/2)} + \omega_{2^{n+1}}^{m(2^{n+1} - i - 1/2)} ,$$

which means that this expression is equal to $2 \cos[m(i + 1/2)\pi/2^n]$. The
multiplication with these (relative) phase factors corresponds to a diagonal
matrix $T = 1_2 \otimes \mathrm{diag}(\omega_{2^{n+2}}^m ,\ m = 0, \ldots, 2^n - 1)$, which can be implemented
by a tensor product $T = 1_2 \otimes T_n \otimes \cdots \otimes T_1$ of local operations, where
$T_i = \mathrm{diag}(1, \omega_{2^{n+2}}^{2^{i-1}})$ for $i = 1, \ldots, n$.

Looking at the state obtained so far, we see that $|\varphi_{in}\rangle$ has been mapped
to $|\varphi'\rangle$, the lower $2^n$ components of which are given by

$$\varphi'(m) = \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n - 1} \cos[m(i + 1/2)\pi/2^n]\, x(i) ,$$

where $m = 0, \ldots, 2^n - 1$. Using an elementary property of the cosine, we
see that the following holds for the components of this vector:

$$\varphi'(m) = -\varphi'(2^{n+1} - m), \quad \text{for } m = 1, \ldots, 2^n - 1 ,$$

and, furthermore, $\varphi'(2^n) = 0$. Hence we are nearly
finished, since, except for cleaning up the help register, the x register has
been transformed according to $\mathrm{DCT}_{II}$.
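The reduction to a DFT of double length can be checked classically for the lower $2^n$ components. The sketch below assumes the orthonormal type-II definition and the root of unity $\omega = e^{2\pi i/2^{n+1}}$; the final rescaling (factor 1 for m = 0 and $\sqrt{2}$ otherwise) stands in for the cleanup of the help register and is our own shortcut, not the circuit construction of the proof.

```python
import cmath
import math

def dct2_direct(x):
    """DCT of type II computed from its (orthonormal) definition."""
    n_points = len(x)
    out = []
    for m in range(n_points):
        k = 1 / math.sqrt(2) if m == 0 else 1.0
        s = sum(math.cos(m * (i + 0.5) * math.pi / n_points) * x[i]
                for i in range(n_points))
        out.append(math.sqrt(2 / n_points) * k * s)
    return out

def dct2_via_double_dft(x):
    """Same transform via the symmetric extension y, a DFT of length 2N and a phase."""
    n_points = len(x)
    y = [v / math.sqrt(2) for v in x] + [v / math.sqrt(2) for v in reversed(x)]
    out = []
    for m in range(n_points):
        dft_m = sum(cmath.exp(2j * math.pi * m * i / (2 * n_points)) * y[i]
                    for i in range(2 * n_points)) / math.sqrt(2 * n_points)
        phi = cmath.exp(1j * math.pi * m / (2 * n_points)) * dft_m  # phase factor w^(m/2)
        scale = 1.0 if m == 0 else math.sqrt(2)                     # stands in for the cleanup
        out.append(scale * phi.real)
    return out

x = [0.5, -0.1, 0.3, 0.7, -0.2, 0.0, 0.25, -0.4]
direct = dct2_direct(x)
reduced = dct2_via_double_dft(x)
```

After the phase multiplication the lower components are real cosine sums, and they agree with the directly computed transform up to the stated rescaling.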

Cleaning up the help register can be accomplished by the matrix

[Matrix display not recoverable from this copy; the surviving entries are 1 and $\pm 1/\sqrt{2}$.]

where the permutation matrix $\pi$ arranges the columns $0, \ldots, 2^{n+1}$ in such a
way that $(i, 2^{n+1} - i + 1)$ stand next to each other for $i = 1, \ldots, 2^n - 1$. This can
be achieved by a quantum circuit, used also in [228], where this permutation
appeared in the reordering of irreducible representations according to the
action of a dihedral group. The circuit implementing $\pi$ is given in Fig. 4.16
and can be performed in $O(n^2)$ operations. Here $P_n$ is the cyclic shift
$x \mapsto x + 1 \bmod 2^n$ on the basis states, which can be implemented in $O(n^2)$ operations
(see [228], Sect. 3).





[Circuit diagram with blocks labelled $x \mapsto -x$ and $P_n$]

Fig. 4.16. Grouping the matrix entries pairwise via $\pi$



The matrix obtained after conjugation with $\pi$ is a tensor product of the
form $1_{2^n} \otimes D_1$, where

[definition of $D_1$ missing in this copy]

up to a multiplication with the block diagonal matrix

[matrix missing in this copy]

which in turn can be implemented by an (n-1)-fold controlled operation, i.e.
in $O(n^2)$ elementary gates [172].

Overall, we obtain the circuit for the implementation of a $\mathrm{DCT}_{II}$ in $O(n^2)$
elementary gates shown in Fig. 4.17 (we have set D 2 := D^^). □

We close this section with the remark that it is possible to derive other
decompositions of $\mathrm{DCT}_{II}$ into a product of elementary quantum gates which
have the advantage of being in-place, i.e. the overhead of one qubit that the
realization given above needs can be saved [250].

This factorization of $\mathrm{DCT}_{II}(2^n)$ of length $2^n$ can also be computed in
$O(n^2)$ elementary operations and does not make use of auxiliary registers.




Fig. 4.17. Complete quantum circuit for $\mathrm{DCT}_{II}$ using one auxiliary qubit



4.7 Quantum Error-Correcting Codes 

4.7.1 Introduction 

The class of quantum error-correcting codes (QECCs) provides a master ex- 
ample that represents multiple aspects of the features of quantum algorithms. 
They are quantum algorithms per se, showing the applicability of the algorith- 
mic concepts described so far to an intrinsically important area of quantum 
computation and quantum information theory, namely, the stabilization of 
quantum states. 

This is achieved by methods which are based on signal-processing tech- 
niques at two levels: 

• the Hilbert space of the quantum system itself 

• the discrete vector space of the underlying combinatorial configurations. 
The latter structure of finite-geometric spaces gives, furthermore, a deep
insight into the features of entanglement, as the error-correcting codes known 
from classical coding theory share a particular feature, namely that of gen- 
erating sets of codewords with maximal minimum distance by constructing 
suitable geometric configurations. On the other hand, these configurations 
nicely display the features of maximally entangled quantum states. 

4.7.2 Background 

We follow the presentation in [104], Chap. 13 for the background on the
classical theory. A classical error-correcting code (ECC) consists of a set of
binary words

$$x = (x_0, \ldots, x_{n-1})$$

of length n, where each "bit" $x_i \in \{0, 1\}$ can take a value $x_i \in GF(2)$ in the
finite field of two elements. With this elementary notion, the codewords can be
treated as vectors of the n-dimensional GF(2) vector space $GF(2)^n$, where
the set of codewords is usually assumed to form a k-dimensional subspace
$C \le GF(2)^n$. This can be obtained canonically as the range im(G) of a GF(2)
linear mapping $G : GF(2)^k \to GF(2)^n$ of the so-called encoder matrix, which
maps k-bit messages onto n-bit codewords, adding a redundancy of r = n - k
bits in the r so-called parity check bits.

The characteristic parameters of such a linear code C are the rate R = k/n
and the minimum Hamming weight

$$w_H := \min\{ \mathrm{wgt}_H(c) : c \in C,\ c \ne 0 \} .$$

The Hamming weight for a vector $x = (x_0, \ldots, x_{n-1}) \in GF(2)^n$ is defined
by $\mathrm{wgt}_H(x) := |\mathrm{supp}(x)|$, where $\mathrm{supp}(x) := \{ i \in [0, \ldots, n-1] : x_i \ne 0 \}$.

It may be noted that the Hamming distance $d_H(u, v) := \#\{ i \in [0, \ldots, n-1] : u_i \ne v_i \}$,
which measures the number of bit flips necessary to change v
into u, is given by $d_H(u, v) = \mathrm{wgt}_H(u - v)$. Thus, since the code C is assumed
to be a linear subspace of $GF(2)^n$, the equality for the minimum distance
$d_H(C) = \min_{u \ne v} d_H(u, v) = w_H$ is easily derived.
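The equality of minimum distance and minimum weight can be observed on a small linear code. The sketch below uses a [7,4] Hamming code with a systematic generator matrix as a stand-in example; this particular code and the helper names are our choice and do not come from the text.

```python
from itertools import product

# Systematic generator matrix [I_4 | P] of a [7,4] Hamming code (illustrative choice)
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]

def encode(message, generator):
    """Row vector times generator matrix over GF(2)."""
    return tuple(sum(m * g for m, g in zip(message, column)) % 2
                 for column in zip(*generator))

def weight(v):
    """Hamming weight: number of nonzero entries."""
    return sum(1 for bit in v if bit)

def distance(u, v):
    """Hamming distance: number of positions in which u and v differ."""
    return sum(1 for a, b in zip(u, v) if a != b)

codewords = [encode(m, G) for m in product((0, 1), repeat=4)]
min_weight = min(weight(c) for c in codewords if any(c))
min_distance = min(distance(u, v) for u in codewords for v in codewords if u != v)
```

Because the difference of two codewords is again a codeword, the pairwise search over distances and the search over nonzero weights return the same value.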

In constructing error-correcting codes, besides solving the parametric op- 
timization problem of maximizing the rate of C and its minimum weight, it 
is a challenge to provide an efficient decoding algorithm at the same time. A 
decoder for C can in principle be built as follows: 

Lemma 4.3. The dual code $C^\perp$ of C, which is the (n - k)-dimensional subspace
of $GF(2)^n$ orthogonal to C with respect to the canonical GF(2) bilinear
pairing (see also Sect. 4.4.2), is generated by any matrix $H : GF(2)^{n-k} \to GF(2)^n$
with $\mathrm{im}(H) = C^\perp$.
Obviously $C = \mathrm{Ker}(H^t)$, the kernel of the transpose of H, and $G \cdot H^t = 0$.

□

For the sake of simplicity we assume that each codeword $c \in C$ is transmitted
through a binary symmetric channel (BSC) (Fig. 4.18), which is the
master model of a discrete memoryless channel [251]. A BSC is assumed to
add the error vector $e \in GF(2)^n$ independently of the codeword so that a
noisy vector u = c + e is received with probability $q^{n - \mathrm{wgt}(e)} \cdot p^{\mathrm{wgt}(e)}$, where
q = 1 - p.



[Channel diagram: the inputs 0 and 1 are transmitted unchanged with probability q = 1 - p and flipped with probability p]

Fig. 4.18. Binary symmetric channel with parameter p



From the syndrome

$$s := u \cdot H^t = c \cdot H^t + e \cdot H^t = e \cdot H^t$$

we see that the error pattern determines an affine subspace of $GF(2)^n$, namely
a coset of $C$ in $GF(2)^n$. In order to allow error correction, the syndrome has
to be linked to the unknown error vector $e$ uniquely, usually according to
the maximum-likelihood decoding principle. Obviously, for the BSC this can
be achieved by finding the unique codeword $c$ that minimizes the Hamming
distance to $u$. For combinatorial reasons this is possible if

$$\mathrm{wgt}(e) \le \frac{w_H - 1}{2} .$$

Thus a code with minimum Hamming weight $w_H = 2t + 1$ is said to correct
$t$ errors per codeword.
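The decoder outlined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the [7,4] Hamming code is chosen here only as a convenient example (it is not discussed in the text), and the names `H`, `syndrome`, `LEADER` and `decode` are hypothetical.

```python
import numpy as np
from itertools import product

N = 7
# Parity-check matrix of the [7,4] Hamming code (assumed example):
# column j holds the binary expansion of j + 1.
H = np.array([[(j >> i) & 1 for j in range(1, N + 1)] for i in range(3)])

def syndrome(u):
    """s = u . H^t over GF(2)."""
    return tuple(int(b) for b in (np.asarray(u) @ H.T) % 2)

# The code C is the kernel of H^t.
CODE = [np.array(v) for v in product([0, 1], repeat=N) if not any(syndrome(v))]
W_H = min(int(c.sum()) for c in CODE if c.any())   # minimum Hamming weight
T = (W_H - 1) // 2                                 # number of correctable errors

# Link every syndrome to the error pattern of weight <= T that produces it.
LEADER = {}
for e in product([0, 1], repeat=N):
    if sum(e) <= T:
        LEADER[syndrome(e)] = np.array(e)

def decode(u):
    """Remove the error pattern that belongs to the syndrome of u."""
    return (np.asarray(u) + LEADER[syndrome(u)]) % 2

received = CODE[5].copy()
received[2] ^= 1                                   # one bit flip in the channel
print(decode(received), CODE[5])
```

Since the syndrome depends only on the error, the table `LEADER` is built once and reused for every received vector.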



4.7.3 A Classic Code 

An example which today can be considered a classic in the field of science and
technology is the application and design of (first-order) Reed-Muller codes
$RM(1, m)$. This is not merely because they were a decisive piece of discrete
mathematics in producing the first pictures from the surface of Mars in the
early 1970s after the landing of Mariner 9 on 19 January 1972.

The description of their geometric construction and the method of error
detection/correction of the corresponding wave functions, which we have
taken from [104], Chap. 13, is quite similar to the concept of quantum
error-correcting codes. In this section we describe Reed-Muller codes in a natural
geometrical setting (cf. [103, 104, 252]).

In the first-order Reed-Muller code $RM(1, m)$, the codewords $f \in GF(2)^{2^m}$
are $GF(2)$-linear combinations of class functions of the index-2 subgroups
$H < \mathbb{Z}_2^m$ and their cosets. From this notion, a natural transform into the
orthonormal basis of Walsh-Hadamard functions [253, 254] is given by the
characters of $\mathbb{Z}_2^m$, i.e. the Hadamard transformation (see also Sect. 4.2.1
and [255]). By these means, the $GF(2)$ vector

$$f = (f(u))_{u \in GF(2)^m}$$

is converted into the real (row) vector

$$F = \big((-1)^{f(u)}\big)_{u \in GF(2)^m}$$

by replacing an entry 0 in $f$ by an entry 1 in $F$ and an entry 1 in $f$ by
an entry $-1$ in $F$. Modulation into a wavefunction $F(t)$ is then achieved by
transmitting the step function in the interval $[0, 2^m - 1]$ defined by

$$F(t) = \sum_{i=0}^{2^m - 1} F_{\mathrm{Binary}(i)} \, 1_{[i,\, i+1]}(t) .$$

Note that for the first Hadamard coefficient $\hat{F}_0$ of $F(t)$, which is given by

$$\hat{F}_0 = \sum_{u \in GF(2)^m} F(u) ,$$

the following identity holds:

$$\hat{F}_0 = 2^m - 2\, \mathrm{wgt}(f) .$$

This provides a beautiful maximum-likelihood decoding device similar to that
needed for quantum codes.

We shall illustrate this with the example of the first-order Reed-Muller
code $RM(1, 3)$.

The codewords of the first-order Reed-Muller code $RM(1, m)$ with $m = 3$,
of length 8, can be regarded as incidence vectors of special subsets of points of
$AG(3, 2)$ (see Fig. 4.19). The following table of Boolean functions (see Sect.
4.2.2) $f_i(x_3, x_2, x_1) = 1 \oplus x_i$, i.e. of incidence vectors,

$\mathbf{1} = (1, 1, 1, 1, 1, 1, 1, 1)$ corresponding to the whole space ,
$f_1 = (1, 0, 1, 0, 1, 0, 1, 0)$ corresponding to the 2-subspace $(*, *, 0)$ ,
$f_2 = (1, 1, 0, 0, 1, 1, 0, 0)$ corresponding to the 2-subspace $(*, 0, *)$ ,
$f_3 = (1, 1, 1, 1, 0, 0, 0, 0)$ corresponding to the 2-subspace $(0, *, *)$ ,

(4.10)

thus defines an $(m + 1) \times 2^m$ generator matrix $G$. Its range is the set of
16 codewords formed by all $GF(2)$-linear combinations of these four rows.

Fig. 4.19. A codeword of RM(1, 3) interpreted as a 2-flat in AG(3, 2)



Thus the subset $S = \{P_0, P_3, P_5, P_6\}$ determines the incidence vector
$(1, 0, 0, 1, 0, 1, 1, 0)$, which is the codeword $f_1 + f_2 + f_3$ of $RM(1, 3)$. The
geometric interpretation of this codeword as an $(m - 1)$-flat in $AG(3, 2)$ is
shown in Fig. 4.19.

With respect to $G$, the signal function $F$ transmitted (see Fig. 4.20) thus
corresponds to the $(m + 1)$-bit input vector $(0, 1, 1, 1)$.




Fig. 4.20. Transmitted signal corresponding to the codeword $f_1 + f_2 + f_3 =
(1, 0, 0, 1, 0, 1, 1, 0)$ with and without noise

In the more general case of Reed-Muller codes $RM(1, m)$ and their modulated
signal functions, noise $e(t)$ is added to the signal function by the channel
during transmission, so that the receiver will detect only a signal

$$\psi(t) = F(t) + e(t) .$$

The behavior of the $RM(1, m)$ demodulator-decoder device can be sketched
on the basis of the underlying geometry as follows.

Transformation of the received signal $\psi(t)$ into the orthonormal basis of
Walsh-Hadamard functions $W_u(t)$ of order $m$ is achieved by computing the
$2^m$ scalar products

$$\hat{\psi}(u) = \langle W_u \mid \psi \rangle ,$$

where, for $u \in GF(2)^m$, $W_u(t)$ is given by

$$W_u(t) = \sum_{i=1}^{2^m} (-1)^{u \cdot \mathrm{Binary}(i-1)} \, 1_{[i-1,\, i]}(t) ,$$

corresponding to the $u$th row of the Hadamard matrix $H_{2^m}$.

As the received signal function is represented by a sampling vector $\psi =
F + e$, after the Walsh-Hadamard transformation one obtains $\hat{\psi} = \psi \cdot H_{2^m} =
F H_{2^m} + e H_{2^m}$. We estimate a maximum-likelihood signal function as follows.
From $\hat{F} = F \cdot H_{2^m}$, where

$$\hat{F}(u) = \sum_v (-1)^{u \cdot v} F(v) = \sum_v (-1)^{u \cdot v + f(v) \bmod 2} ,$$

we obtain the identity

$$|2^m - \hat{F}(u)| = 2\, \mathrm{wgt}(u^\perp \oplus f) ,$$

since $u^\perp = (u \cdot v)_{v \in GF(2)^m}$ is the incidence vector of the hyperspace of $AG(m, 2)$
orthogonal to the vector $u \in GF(2)^m$. As the $RM(1, m)$ codes consist of all
incidence vectors of such hyperspaces or their cosets (i.e. the complement),
a minimum distance decoder is realized as shown in Fig. 4.21.®




Fig. 4.21. Minimum-distance decoder for Reed-Muller codes (cf. [104], Chap. 13)



Here the maximizer computes

$$x \in GF(2)^m \ \text{such that} \ |\hat{\psi}(x)| = \max_{u \in GF(2)^m} |\hat{\psi}(u)| .$$

With the overall $\pm$ parity given by $\mathrm{sign}(\hat{\psi}(x)) = (-1)^{\varepsilon}$, the deconverter
produces a most likely codeword $f = \varepsilon(\psi) \cdot \mathbf{1} + x^\perp$. If the enumeration of
the generating hyperspaces $U_i$ (cf. (4.10)) is chosen suitably, the decoder
therefore reproduces the $(m + 1)$-bit message $\mu = (\varepsilon, x_1, \ldots, x_m)$, which is a
maximum-likelihood estimator of the word originally encoded into $f = \mu\, G$.
The reader is urged to compare this decoding algorithm with the quantum
decoding algorithm described in Sect. 4.7.4.
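The demodulator-decoder chain can be sketched in Python for $RM(1, 3)$. This is an illustration under assumptions, not the device of Fig. 4.21 itself: the function names are hypothetical, the points of $AG(3, 2)$ are enumerated by the binary expansion of their index with $x_1$ as the least significant bit, and the last step of `decode` translates the constant term of the decoded affine function back into the message bit for the all-ones row of $G$.

```python
import numpy as np

M = 3   # RM(1, 3): length 2**M = 8, 2**(M + 1) = 16 codewords

def generator_matrix(m=M):
    """Rows 1, f_1, ..., f_m with f_i(x) = 1 xor x_i (x_1 = least significant bit)."""
    points = np.arange(1 << m)
    rows = [np.ones(1 << m, dtype=int)]
    for i in range(m):
        rows.append(1 - ((points >> i) & 1))
    return np.array(rows)

def encode(message, G):
    """Codeword f = message . G over GF(2)."""
    return (np.array(message) @ G) % 2

def modulate(f):
    """Replace an entry 0 by +1 and an entry 1 by -1."""
    return 1 - 2 * np.asarray(f)

def fast_hadamard(a):
    """Unnormalized Walsh-Hadamard transform using O(m 2^m) additions."""
    a = np.array(a, dtype=float)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for j in range(start, start + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def decode(received, m=M):
    """Return the (m + 1)-bit message whose codeword is closest to the signal."""
    spectrum = fast_hadamard(received)
    u = int(np.argmax(np.abs(spectrum)))        # the maximizer
    constant = 0 if spectrum[u] > 0 else 1      # overall sign of the coefficient
    x = [(u >> i) & 1 for i in range(m)]
    # The codeword is constant + x.v, whereas message . G = (mu_0 + sum(x)) + x.v.
    mu0 = (constant + sum(x)) % 2
    return [mu0] + x

G = generator_matrix()
f = encode([0, 1, 1, 1], G)      # the codeword f_1 + f_2 + f_3
signal = modulate(f)
signal[2] = -signal[2]           # noise: one sample flipped by the channel
print(f, decode(signal))
```

For noise-free input the spectrum has a single nonzero coefficient of magnitude $2^m$; one flipped sample lowers it to $2^m - 2$ while all other coefficients have magnitude 2, so the maximizer is still unique.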

® With the fast Hadamard transform algorithm (see Sect. 4.2.1), this decoder
requires $O(m 2^m)$ computational steps. Note that this is one of the earliest
communication applications of generalized FFT algorithms (cf. [171]).

4.7.4 Quantum Channels and Codes

One of the basic features described so far in this contribution has been the use 
of quantum mechanical properties (entanglement, superposition) to speed up 
the solution of classical problems. Most notably, this basic idea of quantum 
computing has been present in the complex of hidden-subgroup problems 
(see Sect. 4.3, 4.5.2 and 5), where the problem was to identify an unknown
subgroup of a given group out of exponentially many candidates, given a su- 
perposition over a coset of the unknown subgroup. In this section we shall take 
the dual point of view: we construct states which are simultaneous eigenstates 
of a suitably chosen subgroup of a fixed error group and have the additional 
property that an element of an unknown coset of this group - this models an 
error which happens to the states - can be identified and also corrected. 

To construct these states, we rely on the powerful theory of classical ECC 
[184] introduced in the preceding sections. In what follows, we shall introduce 
the class of binary QECCs, the so-called CSS codes, referring to the elaborate 
article [255] by Beth and Grassl. These codes were independently discovered 
by Calderbank and Shor [82] and Steane [62]. 

Quantum Channels. Before describing the construction of these codes, 
we shall loosely describe the similarities and differences between a classical 
binary symmetric channel (BSC) (see Sect. 4.7.2, Fig. 4.18) and a quantum 
channel (QC). 

Much as in the idealized case of a BSC, where vectors $c \in GF(2)^n$ are
transmitted, we shall consider the QC to be a carrier of kets $|\psi\rangle \in \mathcal{H}_{2^n} = \mathbb{C}^{2^n}$
spanned by the basis kets $|x\rangle$, where $x \in GF(2)^n$. In this system, so-called
error operations, generated by local errors such as bit-flip errors and sign-flip
errors, can occur in superposition. Similarly to the case of a BSC, where the
error group is isomorphic to $(GF(2)^n, \oplus) = \langle (e_1, \ldots, e_n) : e_i \in \{0, 1\},\ i =
1, \ldots, n \rangle$, in QC the error group

$$\mathcal{E} = \langle e_1 \otimes \ldots \otimes e_n : e_i \in \{\mathrm{id}, \sigma_x, \sigma_z\},\ i = 1, \ldots, n \rangle ,$$

generated by the local bit flips and phase flips, represents all possible error
operators in the quantum channel by the transition diagram described below.

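Initially, the input wavefunction $|\psi\rangle$ will interact with the environment
via $|\psi\rangle \mapsto |\psi\rangle|\varepsilon\rangle$ through a modulator; within the channel, this waveform
evolves under the error group according to the channel characteristics,

$$|\psi\rangle|\varepsilon\rangle \ \xrightarrow{\ \text{error}\ }\ \sum_{\gamma \in \mathcal{E}} \gamma|\psi\rangle|\varepsilon_\gamma\rangle , \tag{4.11}$$

whereas, in the “environment”, ancillae tacitly contain probability amplitudes
for the occurrence of group elements.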

Much as in the BSC, where only errors $e \in GF(2)^n$ with a given maximal
weight $\mathrm{wgt}(e) \le t$ are allowed or assumed, in the QC the sum of the received
ket

$$|\psi\rangle|\varepsilon\rangle \ \mapsto \sum_{\gamma \in \mathcal{E},\ \mathrm{wgt}(\gamma) \le t} \gamma|\psi\rangle|\varepsilon_\gamma\rangle \tag{4.12}$$

will range only over those group elements $\gamma$ which are products of at most
$t$ local (single-bit) errors. Much in accordance with the classical case, for
$\gamma = e_1 \otimes \ldots \otimes e_n \in \mathcal{E}$ we define $\mathrm{wgt}(\gamma)$ to be the number of occurrences of
$\sigma_x$ and $\sigma_z$ needed to generate $\gamma$.

In order to protect quantum states against errors of this kind in a quan- 
tum channel, a so-called Pauli channel, a quantum error-correcting code must 
be constructed to map original quantum states into certain “protected” or- 
thogonal subspaces, so that errors of the type of (4.12) cannot in practice 
damage the original state seriously. For this purpose, the theory of classical 
codes for the BSC can be successfully applied, as we describe below. 

Quantum Codes. The basic principle is to consider encoded quantum states
which are superpositions of basis vectors belonging to classical codes, e.g.

$$|C\rangle := \frac{1}{\sqrt{|C|}} \sum_{c \in C} |c\rangle .$$

First we note a surprising fact expressing an old result of error-correcting 
codes directly in the language of quantum theory. 

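Lemma 4.4 (MacWilliams). Let $C \le GF(2)^n$ be an error-correcting code
with its dual $C^\perp$ as usual. Let $A$ be the elementary abelian group $(GF(2)^n, \oplus)$.
Then $\mathrm{DFT}_A$ is the Hadamard transformation (see Sect. 4.4) $H_{2^n} := H_2 \otimes
\ldots \otimes H_2$ ($n$ factors). For each error vector $e \in GF(2)^n$,

$$H_{2^n}\, \frac{1}{\sqrt{|C|}} \sum_{c \in C} |c + e\rangle = \frac{1}{\sqrt{|C^\perp|}} \sum_{c \in C^\perp} (-1)^{c \cdot e} |c\rangle . \tag{4.13}$$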

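Proof. This result is due to the identity

$$\mathrm{DFT}_A\, \frac{1}{\sqrt{|U|}} \sum_{u \in U} |u\rangle = \frac{1}{\sqrt{|U^\perp|}} \sum_{\chi \in U^\perp} |\chi\rangle ,$$

which holds for all subgroups $U$ of an abelian group $A$ (see Sect. 4.4.2 and
4.4.3). □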
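The statement of the lemma can be checked numerically for a small code. The sketch below is an illustration under assumptions: it uses the repetition code $\{(0,0,0), (1,1,1)\}$ as the example, and the helper names are hypothetical.

```python
import numpy as np
from itertools import product

def hadamard(n):
    """H_{2^n} = H_2 (x) ... (x) H_2 with n factors."""
    h2 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    h = np.array([[1.0]])
    for _ in range(n):
        h = np.kron(h, h2)
    return h

def index(bits):
    """Position of the basis ket |bits> (first bit most significant)."""
    return int("".join(str(b) for b in bits), 2)

def pairing(u, v):
    """Canonical GF(2) bilinear pairing u . v."""
    return sum(a * b for a, b in zip(u, v)) % 2

def dual(code, n):
    """All vectors orthogonal to every codeword."""
    return [v for v in product([0, 1], repeat=n)
            if all(pairing(v, c) == 0 for c in code)]

def coset_state(code, e, n):
    """Normalized superposition over the coset C + e."""
    psi = np.zeros(2 ** n)
    for c in code:
        psi[index([(a + b) % 2 for a, b in zip(c, e)])] = 1
    return psi / np.sqrt(len(code))

def phased_dual_state(code, e, n):
    """Superposition over the dual code with phases (-1)^(c . e)."""
    d = dual(code, n)
    psi = np.zeros(2 ** n)
    for c in d:
        psi[index(c)] = (-1) ** pairing(c, e)
    return psi / np.sqrt(len(d))

C = [(0, 0, 0), (1, 1, 1)]
e = (1, 0, 0)                      # a single bit flip on the first position
lhs = hadamard(3) @ coset_state(C, e, 3)
rhs = phased_dual_state(C, e, 3)
print(np.allclose(lhs, rhs))
```

The support of the transformed state is the dual code for every choice of `e`; only the signs of the amplitudes change.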

The lemma says that any bit-flip error applied to the state $|C\rangle$ will give,
after the Hadamard transformation, a state whose support $C^\perp$ is translation
invariant, the shift being expressed only in the phases of the elements of $C^\perp$.
Dually, a phase-flip error in $|C\rangle$ will occur as a bit-flip error in $|C^\perp\rangle$.

From this, we deduce the following basic coding principle ([255], p. 462):
given a classical binary linear $(n, k)$ code $C$ of length $n$ and dimension $k$, the
states of the related quantum code are given by

$$|\psi_{w_i}\rangle = \frac{1}{\sqrt{|C|}} \sum_{c \in C} (-1)^{c \cdot w_i} |c\rangle \tag{4.14}$$

with suitable $w_i \in GF(2)^n$ encoding the $i$th basis ket of the initial state
to be protected. In addition to the choice of the classical code $C$, for the
construction of a binary quantum code a subset $W \subseteq GF(2)^n / C^\perp$ has to be
given to define the vectors $w_i$.

Example 4.8. Let $C := \{(0, 0, 0), (1, 1, 1)\} \subseteq GF(2)^3$ be the dual of the
Reed-Muller code $\mathcal{R} = RM(1, 3)$ shown in Fig. 4.19. Here $W := GF(2)^3 / C^\perp$
provides an appropriate choice,

$$w_0 = (0, 0, 0), \quad w_1 = (1, 1, 1) ,$$

for the following encoding:

$$|0\rangle \mapsto |\psi_{w_0}\rangle = |0, 0, 0\rangle + |1, 1, 1\rangle , \qquad |1\rangle \mapsto |\psi_{w_1}\rangle = |0, 0, 0\rangle - |1, 1, 1\rangle . \tag{4.15}$$

Note that the state $|0, 0, 0\rangle + |1, 1, 1\rangle$ is the maximally entangled GHZ state.
We remark that the GHZ state is, up to local unitary transformations, the
unique maximally entangled state [256]. The “protected” subspace of code
vectors is, by definition,

$$\mathcal{H}_0 = \{ |\psi\rangle = \alpha|\psi_{w_0}\rangle + \beta|\psi_{w_1}\rangle \ :\ |\alpha|^2 + |\beta|^2 = 1 \} . \tag{4.16}$$

Since $C$ is a one-error-correcting binary code, the quantum code is endowed
with this property with respect to single bit-flip errors, i.e. it is protected
against the error operators

$$e_3 = \mathrm{id} \otimes \mathrm{id} \otimes \sigma_x, \quad e_2 = \mathrm{id} \otimes \sigma_x \otimes \mathrm{id}, \quad e_1 = \sigma_x \otimes \mathrm{id} \otimes \mathrm{id} \ \in U(8) .$$

Obviously, the subspaces $\mathcal{H}_0$ and $\mathcal{H}_i = e_i \mathcal{H}_0$ ($i = 1, 2, 3$) are mutually
orthogonal, so that

$$\mathbb{C}^8 = \bigoplus_{i=0}^{3} \mathcal{H}_i$$

is the direct sum of the four orthogonal spaces

$\mathcal{H}_0 = \langle\, |0,0,0\rangle + |1,1,1\rangle,\ |0,0,0\rangle - |1,1,1\rangle \,\rangle$ ,
$\mathcal{H}_1 = \langle\, |1,0,0\rangle + |0,1,1\rangle,\ |1,0,0\rangle - |0,1,1\rangle \,\rangle$ ,
$\mathcal{H}_2 = \langle\, |0,1,0\rangle + |1,0,1\rangle,\ |0,1,0\rangle - |1,0,1\rangle \,\rangle$ ,
$\mathcal{H}_3 = \langle\, |0,0,1\rangle + |1,1,0\rangle,\ |0,0,1\rangle - |1,1,0\rangle \,\rangle$ .

Thus, for any linear combination $E$ of these single bit-flip errors $e_i$, the
encoded state $|\psi\rangle = \alpha|\psi_{w_0}\rangle + \beta|\psi_{w_1}\rangle$ is represented by

$$E|\psi\rangle = \sum_{i=0}^{3} \lambda_i\, e_i|\psi\rangle \qquad (e_0 := \mathrm{id})$$

as a direct sum of four orthogonal vectors, each being “proportional” to $|\psi\rangle$.
So, by a measurement, i.e. a random projection onto any of the spaces $\mathcal{H}_i$, or
by applying a conditional gate $U = \mathrm{diag}(e_i : i = 0, \ldots, 3)$, the original state
can be reconstituted, thus correcting up to one bit-flip error.
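The correction step of this example can be simulated directly with state vectors. The following Python sketch rests on assumptions: the two code states are normalized, the amplitudes 0.6 and 0.8i are arbitrary test values, and all function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])   # sigma_x, the bit flip

def kron3(a, b, c):
    return np.kron(np.kron(a, b), c)

# e_0 = id and the single bit-flip errors e_1, e_2, e_3 (e_1 acts on the first qubit).
ERRORS = [kron3(I2, I2, I2), kron3(X, I2, I2), kron3(I2, X, I2), kron3(I2, I2, X)]

# Projector onto H_0 = span{|000>, |111>}; the projector onto H_i is e_i P_0 e_i.
P0 = np.zeros((8, 8))
P0[0b000, 0b000] = P0[0b111, 0b111] = 1.0
PROJECTORS = [e @ P0 @ e for e in ERRORS]

def encode(alpha, beta):
    """alpha |psi_w0> + beta |psi_w1> with normalized (|000> +- |111>)/sqrt(2)."""
    psi = np.zeros(8, dtype=complex)
    psi[0b000] = (alpha + beta) / np.sqrt(2)
    psi[0b111] = (alpha - beta) / np.sqrt(2)
    return psi

def correct(phi):
    """For a single error: find the subspace H_i containing phi and undo e_i."""
    weights = [np.vdot(phi, p @ phi).real for p in PROJECTORS]
    i = int(np.argmax(weights))
    return i, ERRORS[i] @ phi

def project_and_correct(phi, i):
    """Measurement outcome i: project onto H_i, renormalize, then undo e_i."""
    component = PROJECTORS[i] @ phi
    return ERRORS[i] @ component / np.linalg.norm(component)

psi = encode(0.6, 0.8j)
which, recovered = correct(ERRORS[2] @ psi)       # bit flip on the second qubit
print(which, np.allclose(recovered, psi))
```

For a superposition of errors such as `0.8 * ERRORS[0] + 0.6 * ERRORS[3]`, each outcome of `project_and_correct` returns a vector proportional to the encoded state, which is the content of the direct-sum decomposition above.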

But note that if, instead of $C$, the code $\mathcal{R} = C^\perp$ had been selected, this
code construction could not have been successful, as $\mathcal{R} = RM(1, 3)$ can
detect one error but not correct it. It can be seen (see [255], p. 463) that the
corresponding quantum code inherits this property of detecting one phase
error but not being capable of correcting it.

Motivated by and starting from this example and the properties and prob- 
lems derived from it, we now quote the following theorem, which provides a 
method to obtain quantum codes from classical codes. 

Theorem 4.17 (CSS Codes). Suppose $C_1, C_2 \subseteq GF(2)^n$ are classical binary
codes with parameters $(n_1, k_1, d_1)$ and $(n_2, k_2, d_2)$, respectively, fulfilling
the additional requirement $C_1^\perp \subseteq C_2$. Let $W := \{w_i : i = 1, \ldots, [C_2 : C_1^\perp]\}$ be
a system of representatives of the cosets $C_2 / C_1^\perp$. Then the set of states

$$|\psi_{w_i}\rangle = \frac{1}{\sqrt{|C_1|}} \sum_{c \in C_1} (-1)^{c \cdot w_i} |c\rangle$$

forms a quantum code $Q$ which can correct $(d_1 - 1)/2$ bit errors and $(d_2 - 1)/2$
phase errors.

In practice, we have the following corollary in the case of weakly self-dual
codes $C \subseteq C^\perp$, which were shown to be asymptotically good in [82]:

Theorem 4.18. Let $C$ be a weakly self-dual binary code with dual distance $d$.
Then the corresponding quantum code is capable of correcting up to $(d - 1)/2$
errors.

The construction in this theorem can be made more general, as Beth and
Grassl have shown in [255]:

Theorem 4.19. Let $C^\perp$ be a weakly self-dual binary code. If for

$$\mathcal{M}_0 := \{ w \in GF(2)^n / C^\perp \mid d_H(C^\perp, C^\perp + w) \le t \}$$

the following condition,

$$\forall\, w_i, w_j : \ i \ne j \ \Rightarrow\ \mathcal{M}_0 \cap (\mathcal{M}_0 + (w_i - w_j)) = \emptyset , \tag{4.17}$$

is satisfied, the quantum code can correct $t$ errors, i.e. any error operator
$e \in \mathcal{E}$ where the total number of positions exposed to bit flips or sign flips is
at most $t$.

This leads to the following decoding algorithm. 

Algorithm 6 (Quantum Decoding Algorithm) Let $|\phi\rangle$ be encoded by a
QECC according to the theorems above. The received vector $E|\phi\rangle$ will be
decoded and reconstructed by the following steps:

• Perform a measurement to determine the bit-flip errors, i.e. project onto
the code space $\mathcal{H}_C$ or one of its orthogonal images $\mathcal{H}_{C+e}$ under a bit-flip
error $e$.

• Correct this bit-flip error (“subtract” $e$) by applying the corresponding
tensor product of $\sigma_x$ operators.

• Perform a Hadamard transformation.

• Perform a measurement to determine the sign-flip errors, which correspond
to bit-flip errors in the actual basis.

• Correct this error.

• Reencode the final state.

This decoding algorithm is easily understood from the point of view of 
binary codes, which have been designed as general constructions for the BSC 
(see Sect. 4.7.2). The reader should also observe the stunning analogy between 
this quantum decoder and the Green-machine decoder described in Sect. 
4.7.3, Fig. 4.21. 
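The third and fourth steps of the algorithm rely on the fact that the Hadamard transformation turns sign flips into bit flips. A short numerical check, written as a sketch with hypothetical helper names `tensor` and `local` and with three qubits chosen arbitrarily:

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])               # bit flip sigma_x
Z = np.array([[1.0, 0.0], [0.0, -1.0]])              # sign flip sigma_z
H2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

def tensor(factors):
    """Tensor product of a list of matrices."""
    return reduce(np.kron, factors)

def local(op, position, n):
    """Operator acting as `op` on one qubit and as the identity elsewhere."""
    return tensor([op if k == position else I2 for k in range(n)])

n = 3
hadamard = tensor([H2] * n)
for k in range(n):
    # Conjugating a local sign flip by the Hadamard transformation ...
    conjugated = hadamard @ local(Z, k, n) @ hadamard
    # ... gives the local bit flip on the same position.
    assert np.allclose(conjugated, local(X, k, n))
```

After the Hadamard step, the sign-flip errors can therefore be located with the same kind of measurement that was used for the bit flips.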

4.8 Conclusions 

We have presented an introduction to a computational model of quantum 
computers from a computer science point of view. Discrete Fourier transforms 
have been introduced as important subroutines used in several quantum al- 
gorithms. Throughout, unitary transformations which can be implemented 
in terms of elementary gates using a quantum circuit of polylogarithmic size 
have been of special interest; they yield an exponential speedup compared 
with the classical situation in many cases. 

The quantum algorithms presented here exploit the fundamental princi- 
ples of interference, superposition and entanglement that quantum physics 
offers. We have explored these principles in various algorithms, ranging from 
Shor’s algorithms to algorithms for quantum error-correcting codes.

5.1 Introduction



Quantum entanglement is one of the most striking features of the quantum 
formalism [26]. It can be expressed as follows: If two systems interacted in 
the past it is, in general, not possible to assign a single state vector to either 
of the two subsystems [257]. This is what is sometimes called the principle of 
nonseparability. A common example of an entangled state is the singlet state 
[258],

$$\psi_- = \frac{1}{\sqrt{2}} (|01\rangle - |10\rangle) . \tag{5.1}$$



One can see that it cannot be represented as a product of individual vectors 
describing states of subsystems. Historically, entanglement was first recognized
by Einstein, Podolsky and Rosen (EPR) [24] and by Schrödinger [5].^1
In their famous paper, EPR suggested a description of the world (called “local 
realism”) which assigns an independent and objective reality to the physi- 
cal properties of the well-separated subsystems of a compound system. Then 
EPR applied the criterion of local realism to predictions associated with an 
entangled state to conclude that quantum mechanics was incomplete. The 
EPR criticism was the source of many discussions concerning fundamental 
differences between the quantum and classical descriptions of nature. 

The most significant progress toward the resolution of the EPR problem 
was made by Bell [25], who proved that local realism implies constraints on 
the predictions of spin correlations in the form of inequalities (called Bell’s 
inequalities) which can be violated by quantum mechanical predictions for 
a system in the state (5.1). The latter feature of quantum mechanics, usu- 
ally called nonlocality, is one of the most evident manifestations of quantum 
entanglement. 

Information-theoretical aspects of entanglement were first considered by
Schrödinger, who wrote [5], in the context of the EPR problem,

“Thus one disposes provisionally (until the entanglement is resolved by actual
observation) of only a common description of the two in that space of higher dimension.
This is the reason that knowledge of the individual systems can decline to the
scantiest, even to zero, while that of the combined system remains continually maximal.
Best possible knowledge of a whole does not include best possible knowledge of its
parts - and that is what keeps coming back to haunt us”.

^1 In fact, entangled quantum states had been used in investigations of the
properties of atomic and molecular systems [259].

(c) Springer-Verlag Berlin Heidelberg 2001

Michal Horodecki, Pawel Horodecki and Ryszard Horodecki

In this way Schrödinger recognized a profoundly nonclassical relation between
the information which an entangled state gives us about the whole system 
and the information which it gives us about the subsystems. 

The recent development of quantum information theory has shown that 
entanglement can have important practical applications (see, e.g., [1, 2, 3, 
260]). In particular, it turns out that entanglement can be used as a resource 
for communication of quantum states in an astonishing process called quantum
teleportation [261].^2 In the latter, a quantum state is transmitted by use
of a pair of particles in a singlet state (5.1) shared by the sender and receiver 
(usually referred to as Alice and Bob), and two bits of classical communica- 
tion. However, in real conditions, owing to interaction with the environment, 
called decoherence, we encounter mixed states rather than pure ones. These 
mixed states can still possess some residual entanglement. More specifically, 
a mixed state is considered to be entangled if it is not a mixture of prod- 
uct states [263]. In mixed states the quantum correlations are weakened, and 
hence the manifestations of mixed-state entanglement can be very subtle 
[263, 264, 265]. Nevertheless, it appears that it can be used as a resource for 
quantum communication. Such a possibility is due to the discovery of distil- 
lation of entanglement [266]; by manipulation of noisy pairs, involving local 
operations and classical communication, Alice and Bob can obtain singlet 
pairs and apply teleportation. This procedure provides a powerful protection 
of the quantum data transmission against the environment. 

Consequently, the fundamental problem was to investigate the structure 
of mixed-state entanglement, especially in the context of quantum communication.
These investigations have led to the discovery of a discontinuity in the
structure of mixed-state entanglement. It appears that there are at least two 
qualitatively different types of entanglement [71]: free, which is useful for 
quantum communication, and bound, which is a nondistillable, very weak 
and mysterious type of entanglement. 

The present contribution is divided into two main parts. In the first part 
we report results of an investigation of the mathematical structure of entan- 
glement. The main question is: given a mixed state, is it entangled or not? We 
present powerful tools that allow us to obtain the answer in many interesting 
cases. A crucial role is played here by the connection between entanglement 
and the theory of positive maps [267]. In contrast to completely positive maps 
[88], positive maps have not been applied in physics so far. The second part 
is devoted to the application of the entanglement of mixed states to quantum 
communication. Now, the leading question is: given an entangled state, can 
it be distilled? The mathematical tools worked out in the first part allow us
to answer the question. Surprisingly, the answer does not simplify the picture
but, rather, reveals a new horizon including the basic question: what is the
role of bound entanglement in nature?

^2 For experimental realizations, see [262].

Mixed-State Entanglement and Quantum Communication

Since entanglement is a basic ingredient of quantum information theory, 
the scope of application of the research presented here goes far beyond the 
quantum communication problem. The insight into the structure of entangle- 
ment of mixed states can be helpful in many subfields of quantum information 
theory, including quantum computing, quantum cryptography, etc. 

Finally, it must be emphasized that our approach will be basically qual- 
itative. Thus we shall not review here the beautiful work performed in the 
domain of quantifying entanglement [66, 268, 269, 270, 271] (we shall only 
touch on this subject in the second part). Owing to the limited space for 
the present contribution, we shall also restrict our considerations to the en- 
tanglement of bipartite systems, even though a number of results have been 
recently obtained for multipartite systems (see, e.g., [272, 273]). 



5.2 Entanglement of Mixed States: Characterization 

We shall deal with states on the finite-dimensional Hilbert space $\mathcal{H}_{AB} =
\mathcal{H}_A \otimes \mathcal{H}_B$. We shall call the system described by the Hilbert space $\mathcal{H}_{AB}$ the
$n \otimes m$ system, where $n$ and $m$ are the dimensions of the spaces $\mathcal{H}_A$ and $\mathcal{H}_B$,
respectively. An operator $\varrho$ acting on $\mathcal{H}_{AB}$ is a state if $\mathrm{Tr}\, \varrho = 1$ and if it is a
positive operator, i.e.

$$\mathrm{Tr}\, \varrho P \ge 0 \tag{5.2}$$

for any projectors $P$ (equivalently, positivity of an operator means that it is
Hermitian and has nonnegative eigenvalues).

A state acting on the Hilbert space $\mathcal{H}_{AB}$ is called separable^3 if it can be
approximated in the trace norm by states of the form

$$\varrho = \sum_{i=1}^{k} p_i\, \varrho_i \otimes \tilde{\varrho}_i , \tag{5.3}$$

where $\varrho_i$ and $\tilde{\varrho}_i$ are states on $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively. In finite dimensions
one can use a simpler definition [274] (see also [269]): $\varrho$ is separable if it is of
the form (5.3) for some $k$ (one can always find a $k \le (\dim \mathcal{H}_{AB})^2$). Note that
the property of being entangled or not does not change if one subjects the
state to a product unitary transformation $\varrho \to \varrho' = U_1 \otimes U_2\, \varrho\, U_1^\dagger \otimes U_2^\dagger$. We
call the states $\varrho$ and $\varrho'$ equivalent.

^3 The definition of separable states presented here is due to Werner [263], who
called them classically correlated states.

We shall further need the following maximally entangled pure state of the
$d \otimes d$ system:

$$\psi_+^d = \frac{1}{\sqrt{d}} \sum_{i=1}^{d} |i\rangle \otimes |i\rangle . \tag{5.4}$$

We shall denote the corresponding projector by $P_+^d$ (the superscript $d$ will
usually be omitted). Then, for any state $\varrho$, the quantity $F = \mathrm{Tr}\, \varrho P_+$ is
called the singlet fraction.^4 In general, by maximally entangled states we shall
mean vectors $\psi$ that are equivalent to $\psi_+$:

$$\psi = U_1 \otimes U_2\, \psi_+ ,$$

where $U_1$, $U_2$ are unitary transformations. The most common two-qubit
maximally entangled state is the singlet state (5.1). One can define the fully
entangled fraction of a state $\varrho$ of the $d \otimes d$ system by

$$F(\varrho) = \max_{\psi} \langle \psi|\varrho|\psi \rangle , \tag{5.5}$$

where the maximum is taken over all maximally entangled vectors of the $d \otimes d$
system.

5.2.1 Pure States 

If $\varrho$ is a pure state, i.e. $\varrho = |\psi\rangle\langle\psi|$, then it is easy to check if it is entangled
or not. Indeed, the above definition implies that it is separable if and only
if $\psi = \psi_A \otimes \psi_B$, i.e. if either of its reduced density matrices is a pure state.

Thus it suffices to find eigenvalues of either of the reductions. Equivalently,
one can refer to the Schmidt decomposition [275] of the state. As one knows,
for any pure state $\psi$ there exist bases $\{e_i^A\}$, $\{e_i^B\}$ in the spaces $\mathcal{H}_A$ and $\mathcal{H}_B$
such that

$$\psi = \sum_{i=1}^{k} a_i\, |e_i^A\rangle \otimes |e_i^B\rangle , \qquad k \le \dim \mathcal{H}_{AB} , \tag{5.6}$$

where the positive coefficients $a_i$ are called Schmidt coefficients. The state
is entangled if at least two coefficients do not vanish. One finds that the
positive eigenvalues of either of the reductions are equal to the squares of
the Schmidt coefficients. In the next section we shall introduce a series of
necessary conditions for the separability for mixed states. It turns out that
all of them are equivalent to separability in the case of pure states [276, 277].
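In practice the Schmidt coefficients are obtained from a singular value decomposition of the coefficient matrix of the state. The following Python sketch illustrates this; the function names and the two test states are assumptions made for the example.

```python
import numpy as np

def schmidt_coefficients(psi, dim_a, dim_b, tol=1e-12):
    """Positive Schmidt coefficients of a pure state of a dim_a x dim_b system."""
    amplitudes = np.asarray(psi, dtype=complex).reshape(dim_a, dim_b)
    singular_values = np.linalg.svd(amplitudes, compute_uv=False)
    return singular_values[singular_values > tol]

def reduced_state_a(psi, dim_a, dim_b):
    """Reduction to the first subsystem, Tr_B |psi><psi|."""
    amplitudes = np.asarray(psi, dtype=complex).reshape(dim_a, dim_b)
    return amplitudes @ amplitudes.conj().T

def is_entangled(psi, dim_a, dim_b):
    """A pure state is entangled if at least two Schmidt coefficients do not vanish."""
    return len(schmidt_coefficients(psi, dim_a, dim_b)) >= 2

singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)          # (|01> - |10>)/sqrt(2)
product_state = np.kron([1, 0], [1, 1]) / np.sqrt(2)    # |0> (|0> + |1>)/sqrt(2)

print(schmidt_coefficients(singlet, 2, 2), is_entangled(singlet, 2, 2))
print(schmidt_coefficients(product_state, 2, 2), is_entangled(product_state, 2, 2))
```

The eigenvalues of `reduced_state_a` are the squares of the values returned by `schmidt_coefficients`, in line with the remark on the reductions above.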

^4 In fact, the state $\psi_+$ used in the definition of the singlet fraction is a local
transformation of the true singlet state. Nevertheless, we shall keep the name
“singlet fraction”, while using the state $\psi_+$, which is more convenient for technical
reasons.

5.2.2 Some Necessary Conditions for Separability of Mixed States

A condition that is satisfied by separable states will be called a separability 
criterion. If a separability criterion is violated by the state, the state must 
be entangled. It is important to have strong separability criteria, i.e. those 
that are violated by the largest number of states if possible. 

Since violation of the Bell inequalities is a manifestation of quantum
entanglement, a natural separability criterion is constituted by the Bell
inequalities. In [263] Werner first pointed out that separable states must satisfy all
possible Bell inequalities.^5 The common Bell inequalities derived by Clauser,
Horne, Shimony and Holt (CHSH) are given by [27]

$$\mathrm{Tr}\, \varrho \mathcal{B} \le 2 , \tag{5.7}$$

where the Bell-CHSH observable $\mathcal{B}$ is given by

$$\mathcal{B} = \hat{a}\sigma \otimes (\hat{b} + \hat{b}')\sigma + \hat{a}'\sigma \otimes (\hat{b} - \hat{b}')\sigma . \tag{5.8}$$

Here $\hat{a}, \hat{a}', \hat{b}, \hat{b}'$ are arbitrary unit vectors in $\mathbb{R}^3$, $\hat{a}\sigma = \sum_{i=1}^{3} a_i \sigma_i$, and $\sigma_i$ are
the Pauli matrices. For any given set of vectors we have a different inequality.
In [278] we derived the condition for a two-qubit^6 state that was equivalent
to satisfying all the inequalities jointly. This condition has the following form:

$$M(\varrho) \le 1 , \tag{5.9}$$

where $M$ is constructed in the following way. One considers the $3 \times 3$ real
matrix $T$ with entries $T_{ij} = \mathrm{Tr}\, \varrho\, \sigma_i \otimes \sigma_j$. Then $M$ is equal to the sum of the
two greater eigenvalues of the matrix $T^{\mathrm{T}} T$. This condition characterizes states
violating the most common, and so far the strongest, Bell inequality for two
qubits (see [279] in this context). While it is interesting from the point of
view of nonlocality, it appears to be not a very strong separability criterion.
Indeed, there exists [263] a large class of entangled states that satisfy all
standard Bell inequalities.^7
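The quantity $M(\varrho)$ is easy to evaluate numerically. The Python sketch below is an illustration under assumptions: the one-parameter family `werner_state(p)`, a mixture of the singlet state with the maximally mixed state, is chosen here as a test case, and the function names are hypothetical.

```python
import numpy as np

PAULI = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def correlation_matrix(rho):
    """Real 3 x 3 matrix with entries T_ij = Tr rho (sigma_i (x) sigma_j)."""
    return np.array([[np.trace(rho @ np.kron(si, sj)).real for sj in PAULI]
                     for si in PAULI])

def chsh_m(rho):
    """Sum of the two greater eigenvalues of T^T T."""
    t = correlation_matrix(rho)
    eigenvalues = np.sort(np.linalg.eigvalsh(t.T @ t))
    return float(eigenvalues[-1] + eigenvalues[-2])

def werner_state(p):
    """p |singlet><singlet| + (1 - p) I/4 (illustrative family, an assumption here)."""
    singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
    return p * np.outer(singlet, singlet.conj()) + (1 - p) * np.eye(4) / 4

for p in (1.0, 0.8, 0.5):
    m = chsh_m(werner_state(p))
    print(p, m, "violates CHSH" if m > 1 else "satisfies CHSH")
```

For this family $T = -p \cdot \mathrm{id}$, so that $M = 2p^2$. The value $p = 0.5$ satisfies (5.9), although states of this family are known to be entangled for $p > 1/3$, which is an instance of the class of states mentioned above.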

^5 In [263] Werner also provided a very useful criterion based on the so-called flip
operator (see Sect. 5.2.4).

^6 A qubit is the elementary unit of quantum information and denotes a two-level
quantum system (i.e. a $2 \otimes 2$ system) [69].

^7 See [264, 265, 283] in the context of more sophisticated nonlocality criteria.

Another approach originated from the observation by Schrödinger [5] that
an entangled state gives us more information about the total system than
about subsystems. This gave rise to a series of entropic inequalities of the
form [277, 280]

$$S(\varrho_A) \le S(\varrho) , \qquad S(\varrho_B) \le S(\varrho) , \tag{5.10}$$

where $\varrho_A = \mathrm{Tr}_B\, \varrho$, and similarly for $\varrho_B$. The above inequalities were proved
[277, 280, 281, 282] to be satisfied by separable states for four different
entropies that are particular cases of the Rényi quantum entropies
$S_\alpha = (1 - \alpha)^{-1} \log \mathrm{Tr}\, \varrho^\alpha$:

$$S_0 = \log R(\varrho) , \quad S_1 = -\mathrm{Tr}\, \varrho \log \varrho , \quad S_2 = -\log \mathrm{Tr}\, \varrho^2 , \quad S_\infty = -\log \|\varrho\| , \tag{5.11}$$

where $R(\varrho)$ denotes the rank of the state $\varrho$ (the number of nonvanishing
eigenvalues). The above inequalities are useful tools in many cases (as we
shall see in Sect. 5.3.5, one of them allows us to obtain a bound on the
possible rank of the bound entangled states); still, however, they are not very
strong criteria.
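The entropic inequalities (5.10) can be tested numerically for the four entropies of (5.11). The sketch below assumes natural logarithms and uses hypothetical function names; the singlet state and the maximally mixed state serve only as test cases.

```python
import numpy as np

def partial_trace(rho, dim_a, dim_b, keep):
    """Reduced state of subsystem `keep` ('A' or 'B') of a dim_a x dim_b system."""
    r = np.asarray(rho).reshape(dim_a, dim_b, dim_a, dim_b)
    if keep == "A":
        return np.einsum("ijkj->ik", r)      # trace over the second subsystem
    return np.einsum("ijil->jl", r)          # trace over the first subsystem

def renyi_entropy(rho, alpha, tol=1e-12):
    """S_alpha for alpha = 0, 1, 2, ... or infinity (natural logarithm)."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > tol]                        # nonvanishing eigenvalues only
    if alpha == 0:
        return float(np.log(len(ev)))        # logarithm of the rank
    if alpha == 1:
        return float(-np.sum(ev * np.log(ev)))
    if np.isinf(alpha):
        return float(-np.log(ev.max()))      # the norm is the largest eigenvalue
    return float(np.log(np.sum(ev ** alpha)) / (1 - alpha))

def satisfies_entropic_inequality(rho, dim_a, dim_b, alpha):
    """Check S(rho_A) <= S(rho) and S(rho_B) <= S(rho) up to rounding."""
    s = renyi_entropy(rho, alpha)
    s_a = renyi_entropy(partial_trace(rho, dim_a, dim_b, "A"), alpha)
    s_b = renyi_entropy(partial_trace(rho, dim_a, dim_b, "B"), alpha)
    return s_a <= s + 1e-9 and s_b <= s + 1e-9

singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho_singlet = np.outer(singlet, singlet.conj())
rho_mixed = np.eye(4) / 4

for alpha in (0, 1, 2, np.inf):
    print(alpha,
          satisfies_entropic_inequality(rho_singlet, 2, 2, alpha),
          satisfies_entropic_inequality(rho_mixed, 2, 2, alpha))
```

For the singlet state the subsystems are maximally mixed while the total state is pure, so all four inequalities are violated; the maximally mixed state, which is separable, satisfies them.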

A different approach, presented in [66], is based on local manipulations of 
entanglement (this approach was anticipated in [265]). The main idea is the 
following: a given state is entangled because parties sharing many systems 
(pairs of particles) in this state can produce a smaller number of pairs in a 
highly entangled state (of easily “detectable” entanglement) by local oper- 
ations and classical communication (LOCC). This approach initiated a new 
field in quantum information theory: manipulating entanglement. The sec- 
ond part of this contribution will be devoted to this field. It also initiated the 
subject of the quantification of entanglement. Still, however, the seemingly 
simple qualitative question of whether a given state is entangled or not was 
not solved. 

A breakthrough was achieved by Peres [284], who derived a surprisingly 
simple but very strong criterion. He noted that a separable state remains a 
positive operator if subjected to partial transposition (PT). We will call this 
the positive partial transposition (PPT) criterion. 

To define partial transposition, we shall use the matrix elements of a state
in some product basis:

$$\varrho_{m\mu, n\nu} = \langle m| \otimes \langle \mu|\, \varrho\, |n\rangle \otimes |\nu\rangle , \tag{5.12}$$

where the kets with Latin and Greek letters form an orthonormal basis in the
Hilbert space describing the first and the second system, respectively. Hence
the partial transposition of $\varrho$ is defined as

$$\varrho^{T_B}_{m\mu, n\nu} \equiv \varrho_{m\nu, n\mu} . \tag{5.13}$$

The form of the operator $\varrho^{T_B}$ depends on the choice of basis, but its eigenvalues
do not. We shall say that a state “is PPT” if $\varrho^{T_B} \ge 0$; otherwise we shall
say that the state “is NPT”. The partial transposition is easy to perform in
matrix notation. The state of the $m \otimes n$ system can be written as

$$\varrho = \begin{bmatrix} A_{11} & \cdots & A_{1m} \\ \vdots & \ddots & \vdots \\ A_{m1} & \cdots & A_{mm} \end{bmatrix} , \tag{5.14}$$

with $n \times n$ matrices $A_{ij}$ acting on the second ($\mathbb{C}^n$) space. These matrices are
defined by their matrix elements $\{A_{ij}\}_{\mu\nu} \equiv \varrho_{i\nu, j\mu}$. Then the partial
transposition can be realized simply by transposition (denoted by $T$) of all of these
matrices, namely

$$\varrho^{T_B} = \begin{bmatrix} A_{11}^T & \cdots & A_{1m}^T \\ \vdots & \ddots & \vdots \\ A_{m1}^T & \cdots & A_{mm}^T \end{bmatrix} . \tag{5.15}$$


Now, for any separable state $\varrho$, the operator $\varrho^{T_B}$ must still have
nonnegative eigenvalues [284]. Indeed, consider a partially transposed separable
state

$$\varrho^{T_B} = \sum_i p_i \, \varrho_i \otimes (\tilde{\varrho}_i)^T . \qquad (5.16)$$

Since the state $\tilde{\varrho}_i$ remains positive under transposition, so does the total
state.
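As a numerical illustration of the PPT criterion, the following minimal sketch
(ours, not part of the original derivation) performs the partial transposition
(5.13) by reshaping the density matrix into a four-index array. The helper
names `partial_transpose` and `is_ppt`, the tolerance and the two test states
are assumptions made only for this example.

```python
import numpy as np

def partial_transpose(rho, dA, dB):
    """Transpose the second subsystem of an operator on C^dA (x) C^dB."""
    r = rho.reshape(dA, dB, dA, dB)            # indices (m, mu, n, nu)
    return r.transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def is_ppt(rho, dA, dB, tol=1e-12):
    """True if the partial transpose has no eigenvalue below -tol."""
    return np.linalg.eigvalsh(partial_transpose(rho, dA, dB)).min() >= -tol

# Product state |0><0| (x) |+><+|: separable, so it must be PPT.
zero = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
product = np.kron(np.outer(zero, zero), np.outer(plus, plus))

# Maximally entangled two-qubit state (|00> + |11>)/sqrt(2).
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
bell_state = np.outer(bell, bell)

print(is_ppt(product, 2, 2))       # the separable state passes the test
print(is_ppt(bell_state, 2, 2))    # the entangled state fails it
```

The reshape relies on the product basis being ordered as $|m\rangle \otimes |\mu\rangle$ with
the second index running fastest, which is the ordering produced by `np.kron`.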

Note that what distinguishes the Peres criterion from the earlier ones is 
that it is structural. In other words, it does not say that some scalar function 
of a state satisfies some inequality, but it imposes constraints on the structure 
of the operator resulting from PT. Thus the criterion amounts to satisfying 
many inequalities at the same time. In the next section we shall see that there 
is also another crucial feature of the criterion: it involves a transposition that 
is a positive map but is not a completely positive one. This feature, abstracted 
from the Peres criterion, allows us to find an intimate connection between 
entanglement and the theory of positive maps. 

Finally, it should be mentioned that necessary conditions for separability 
have been recently obtained in the infinite-dimensional case [285, 286]. In 
particular, the Peres criterion was expressed in terms of the Wigner repre- 
sentation and applied to Gaussian wave packets [286]. 



5.2.3 Entanglement and Theory of Positive Maps 

To describe the very fruitful connection between entanglement and positive 
maps we shall need mathematical notions such as positive operators, posi- 
tive maps and completely positive maps. In the following section we establish 
these notions. In the subsequent sections we use them to develop the charac- 
terization of the set of separable states. 

Positive and Completely Positive Maps. We start with the following notation.
By $\mathcal{A}_A$ and $\mathcal{A}_B$ we shall denote the set of operators acting on $\mathcal{H}_A$ and
$\mathcal{H}_B$, respectively. Recall that the set $\mathcal{A}$ of operators acting on some Hilbert
space $\mathcal{H}$ constitutes a Hilbert space itself (a so-called Hilbert-Schmidt space)
with a scalar product $\langle A, B \rangle = \mathrm{Tr}\, A^\dagger B$. One can consider an orthonormal
basis of operators in this space given by $\{|i\rangle\langle j|\}_{i,j=1}^{\dim \mathcal{H}}$, where $\{|i\rangle\}$ is a
basis in the space $\mathcal{H}$. Since we are dealing with a finite dimension, $\mathcal{A}$ is in
fact a space of matrices. Hence we shall sometimes denote it by $M_d$, where $d$ is
the dimension of $\mathcal{H}$.

The space of the linear maps from $\mathcal{A}_A$ to $\mathcal{A}_B$ is denoted by $\mathcal{L}(\mathcal{A}_A, \mathcal{A}_B)$.
We say that a map $\Lambda \in \mathcal{L}(\mathcal{A}_A, \mathcal{A}_B)$ is positive if it maps positive operators in
$\mathcal{A}_A$ into the set of positive operators, i.e. if $A \geq 0$ implies $\Lambda(A) \geq 0$. Finally,
we need the definition of a completely positive (CP) map. We say [88] that
a map $\Lambda \in \mathcal{L}(\mathcal{A}_A, \mathcal{A}_B)$ is completely positive if the induced map

$$\Lambda_n = \Lambda \otimes I_n : \mathcal{A}_A \otimes M_n \to \mathcal{A}_B \otimes M_n \qquad (5.17)$$

is positive for all $n$; here $I_n$ is the identity map on the space $M_n$.^8 Thus
the tensor product of a CP map and the identity maps positive operators
into positive ones. An example of a CP map is $\varrho \to W \varrho W^\dagger$, where $W$ is an
arbitrary operator. As a matter of fact, the general form of a CP map is

$$\Lambda(\varrho) = \sum_i W_i \varrho W_i^\dagger . \qquad (5.18)$$

CP maps that do not increase the trace ($\mathrm{Tr}\, \Lambda(\varrho) \leq \mathrm{Tr}\, \varrho$) correspond to
the most general physical operations allowed by quantum mechanics [88].
If $\mathrm{Tr}\, \Lambda(\varrho) = \mathrm{Tr}\, \varrho$ for any $\varrho$ (we say the map is trace-preserving), then the
operation can be performed with probability 1; otherwise, it can be performed
with probability $p = \mathrm{Tr}\, \Lambda(\varrho)$.

It is remarkable that there are positive maps that are not CP: an example
is just the transposition mentioned in the previous section. Indeed, if $\varrho$ is
positive, then so is $\varrho^T$, because

$$\mathrm{Tr}\, \varrho^T P = \mathrm{Tr}\, \varrho P^T \geq 0 \qquad (5.19)$$

and $P^T$ is still some projector. We have used here the fact that $\mathrm{Tr}\, A^T = \mathrm{Tr}\, A$.
On the other hand, $I \otimes T$ is no longer positive. One can easily check this,
showing that $(I \otimes T) P_+ \equiv P_+^{T_B}$ is not a positive operator.
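The two statements of the preceding paragraph can be checked numerically. The
sketch below is only an illustration under our own assumptions (dimension
$d = 3$, a fixed random seed, and a random positive matrix of the form $G G^\dagger$):
transposition leaves the spectrum of a positive operator unchanged, whereas
$I \otimes T$ applied to the projector $P_+$ produces a negative eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

# A random positive operator G G^dagger and its transpose share one spectrum.
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = G @ G.conj().T
spec = np.linalg.eigvalsh(rho)
spec_T = np.linalg.eigvalsh(rho.T)
print(np.allclose(spec, spec_T))

# Projector P_+ onto the vector (1/sqrt d) sum_i |i>|i>.
psi_plus = np.eye(d).reshape(d * d) / np.sqrt(d)
P_plus = np.outer(psi_plus, psi_plus)

# Apply I (x) T, i.e. transpose the second subsystem only.
pt = P_plus.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
min_eig = np.linalg.eigvalsh(pt).min()
print(min_eig)   # negative: I (x) T is not a positive map
```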

A positive map is called decomposable [287] if it can be represented in the
form

$$\Lambda = \Lambda_1^{CP} + \Lambda_2^{CP} \circ T , \qquad (5.20)$$

where $\Lambda_i^{CP}$ are some CP maps. For low-dimensional systems ($\Lambda : M_2 \to M_2$ or
$\Lambda : M_3 \to M_2$) the set of positive maps can be easily characterized. Namely,
it has been shown [288, 289] that all the positive maps are decomposable in
this case. If, instead, at least one of the spaces is $M_n$ with $n \geq 4$, there exist
nondecomposable positive maps [287, 289] (see the example in Sect. 5.2.4).
No full characterization of positive maps has been worked out so far in this
case.

^8 Of course, a completely positive map is also a positive one.

Characterization of Separable States via Positive Maps. The fact
that complete positivity is not equivalent to positivity is crucial for the
problem of entanglement that we are discussing here. Indeed, trivially, the
product states are mapped into positive operators by the tensor product of a
positive map and an identity: $(\Lambda \otimes I)(\varrho \otimes \tilde{\varrho}) = (\Lambda \varrho) \otimes \tilde{\varrho} \geq 0$. Of course, the
same holds for separable states. Then the main idea is that this property of
separable states is essential, i.e., roughly speaking, if a state $\varrho$ is entangled,
then there exists a positive map $\Lambda$ such that $(\Lambda \otimes I)\varrho$ is not positive. This
means that one can seek the entangled states by means of the positive maps.
Now the point is that not all of the positive maps can help us to determine
whether a given state is entangled. In fact, the completely positive maps do
not “feel” entanglement. Thus the problem of characterization of the set of
the separable states reduces to the following: one should extract from the set
of all positive maps some essential ones. As we shall see later, this is possible
in some cases. Namely, it appears that for the $2 \otimes 2$ and $2 \otimes 3$ systems the
transposition is the only such map. For higher-dimensional systems, apart
from transposition, nondecomposable maps will also be relevant.

Consider the following lemma [267], which will lead us to the basic theo- 
rem relating entanglement and positive maps. 

Lemma 5.1. A state $\varrho \in \mathcal{A}_A \otimes \mathcal{A}_B$ is separable if and only if

$$\mathrm{Tr}(A \varrho) \geq 0 \qquad (5.21)$$

for any operator $A$ satisfying $\mathrm{Tr}(A \, P \otimes Q) \geq 0$, for all pure states $P$ and $Q$
acting on $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively.

Remark. Note that operator $A$, which is positive on product states (i.e. it
satisfies $\mathrm{Tr}\, A \, P \otimes Q \geq 0$), is automatically Hermitian.

The lemma is a reflection of the fact that in real Euclidean space, a convex
set and a point lying outside it can always be separated by a hyperplane.^9
Here, the convex set is the set of separable states, while the point is the
entangled state. The hyperplane is determined by the operator $A$. Though
this operator is not positive, its restriction to product states is still positive.
Thus, this operator has been called the “entanglement witness” [290], as it
indicates the entanglement of some state (the first entanglement witness was
provided in [263]; see Sect. 5.2.4). Now, to pass to positive maps, we shall use
the isomorphism between entanglement witnesses and positive non-CP maps
[291]. Note that, if we have any linear operator $A \in \mathcal{A}_A \otimes \mathcal{A}_B$, we can define
a map $\Lambda$ by

$$\langle k| \Lambda(|i\rangle\langle j|) |l\rangle = \langle i| \otimes \langle k| \, A \, |j\rangle \otimes |l\rangle , \qquad (5.22)$$

which can be rephrased as follows:

$$\frac{1}{d} A = (I \otimes \Lambda) P_+^d , \qquad (5.23)$$

where $d = \dim \mathcal{H}_A$. Conversely, given a map, the above formula allows one
to obtain a corresponding operator. It turns out that this formula also gives
a one-to-one correspondence between entanglement witnesses and positive
non-CP maps [291]. By applying this fact, one can prove [267] the following
theorem:

Theorem 5.1. Let $\varrho$ act on the Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$. Then $\varrho$ is separable
if and only if for any positive map $\Lambda : \mathcal{A}_B \to \mathcal{A}_A$ the operator $(I \otimes \Lambda)\varrho$ is
positive.

As we mentioned, the relevant positive maps here are the ones that are not
completely positive. Indeed, for the CP map $\Lambda$ we have $(I \otimes \Lambda)\varrho \geq 0$ for any
state $\varrho$, and hence CP maps are of no use here. The above theorem presents, to
our knowledge, the first application of the theory of positive maps in physics.
So far, only completely positive maps have been of interest to physicists.
As we shall see, the theorem has proved fruitful both for mathematics (the
theory of positive maps) and for physics (the theory of entanglement).

^9 For infinite dimensions one must invoke the Hahn-Banach theorem, whose
geometric form is a generalization of this fact.

Operational Characterization of Entanglement in Low Dimensions
($2 \otimes 2$ and $2 \otimes 3$ Systems). The first conclusion
derived from the theorem is an operational characterization of the separable
states in low dimensions ($2 \otimes 2$ and $2 \otimes 3$). This follows from the previously
mentioned result that positive maps in low dimensions are decomposable.
Then the condition $(I \otimes \Lambda)\varrho \geq 0$ reads $(I \otimes \Lambda_1^{CP})\varrho + (I \otimes \Lambda_2^{CP})\varrho^{T_B} \geq 0$. Now,
since $\varrho$ is positive and $\Lambda_1^{CP}$ is CP, the first term is always positive. If $\varrho^{T_B}$
is positive, then the second term is also positive, and hence their sum is
a positive operator. Thus, to check whether for all positive maps we have
$(I \otimes \Lambda)\varrho \geq 0$, it suffices to check only transposition. One obtains the following
[267] (see also [292]):

Theorem 5.2. A state $\varrho$ of a $2 \otimes 2$ or $2 \otimes 3$ system is separable if and only
if its partial transposition is a positive operator.

Remark. Equivalently, one can use the partial transposition with respect to
the first space.

The above theorem is an important result, as it allows one to determine
unambiguously whether a given quantum state of a $2 \otimes 2$ or a $2 \otimes 3$ system can
be written as a mixture of product states or not. The necessary and sufficient
condition for separability here is surprisingly simple; hence it has found many
applications. In particular, it has been applied in the context of broadcasting
entanglement [293], quantum information flow in quantum copying networks
[294], disentangling machines [295], imperfect two-qubit gates [296], analysis
of the volume of the set of entangled states [297, 298], decomposition of
separable states into minimal ensembles or pseudo-ensembles [299], entanglement
splitting [300] and analysis of entanglement measures [270, 301, 302].

In Sect. 5.3.2 we describe the first application [303]: by use of this theorem 
we show that any entangled two-qubit system can be distilled, and hence is 
useful for quantum communication. 

Higher Dimensions: Entangled States with Positive Partial Transposition.
Since the Størmer-Woronowicz characterization of positive maps applies only to
low dimensions, it follows that for higher dimensions partial transposition will
not constitute a necessary and sufficient condition for separability. Thus there
exist states that are entangled but are PPT (see Fig. 5.1). The first explicit
examples of entangled but PPT states were provided in [274]. Later on, it became
apparent that the mathematical literature concerning nondecomposable maps
contains examples of matrices that can be treated as prototypes of PPT entangled
states [292, 304].

We shall now describe the method of obtaining such states presented 
in [274], as it has proved to be a fruitful direction in searching for PPT 
entangled states. Section 5.3 will describe the motivation for undertaking 
the very tedious task of this search - the states represent a curious type of 
entanglement, namely bound entanglement. 

To find the desired examples we must find an entangled PPT state. Of
course, we cannot use the strongest tool so far described, i.e. the PPT
criterion, because we are actually trying to find a state that is PPT. So we must
derive a criterion that would be stronger in some cases. It appears that the
range^10 of the state can tell us much about its entanglement in some cases.
This is contained in the following theorem, derived in [274] on the basis of
the analogous condition for positive maps considered in [289].

Theorem 5.3 (Range Criterion). If a state $\varrho$ acting on the space $\mathcal{H}_{AB}$
is separable, then there exists a family of product vectors $\psi_i \otimes \phi_i$ such that

(a) they span the range of $\varrho$

(b) the vectors $\{\psi_i \otimes \phi_i^*\}_{i=1}^k$ span the range of $\varrho^{T_B}$ (where * denotes complex
conjugation in the basis in which partial transposition was performed).

In particular, any of the vectors $\psi_i \otimes \phi_i^*$ belongs to the range of $\varrho^{T_B}$.

Now, in [274] there were presented two examples of PPT states violating
the above criterion. We shall present the example for a $2 \otimes 4$ case.^11 The
matrix is written in the standard product basis:

$$\varrho_b = \frac{1}{7b+1} \begin{bmatrix}
b & 0 & 0 & 0 & 0 & b & 0 & 0 \\
0 & b & 0 & 0 & 0 & 0 & b & 0 \\
0 & 0 & b & 0 & 0 & 0 & 0 & b \\
0 & 0 & 0 & b & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{1+b}{2} & 0 & 0 & \frac{\sqrt{1-b^2}}{2} \\
b & 0 & 0 & 0 & 0 & b & 0 & 0 \\
0 & b & 0 & 0 & 0 & 0 & b & 0 \\
0 & 0 & b & 0 & \frac{\sqrt{1-b^2}}{2} & 0 & 0 & \frac{1+b}{2}
\end{bmatrix} , \qquad (5.24)$$

where $0 < b < 1$. Now, by performing PT as defined by (5.15), we can
check that $\varrho_b^{T_B}$ remains a positive operator. By a tedious calculation, one
can check that none of the product vectors belonging to the range of $\varrho_b$, if
they are partially conjugated (as stated in the theorem), belongs to the range
of $\varrho_b^{T_B}$. Thus the condition stated in the theorem is drastically violated, and
hence the state is entangled. As we shall see further, the entanglement is
masked so subtly that it cannot be distilled at all!

^10 The range of an operator $A$ acting on the Hilbert space $\mathcal{H}$ is given by
$R(A) = \{A(\psi) : \psi \in \mathcal{H}\}$. If $A$ is a Hermitian operator, then the range is
equivalent to the support, i.e. the space spanned by its eigenvectors with
nonzero eigenvalues.

^11 This is based on an example concerning positive maps [289].

Fig. 5.1. Structure of entanglement of mixed states for $2 \otimes 2$ and $2 \otimes 3$ systems (a)
and for higher dimensions (b)

Range Criterion and Positive Nondecomposable Maps. The separa- 
bility criterion given by the above theorem has been fruitfully applied in the 
search for PPT entangled states [273, 305, 306]. Theorem 5.3 was applied
in [307], where a technique of subtraction of product vectors from the range
of the state was used to obtain the best separable approximation (BSA) of
the state. As a tool, the authors considered subspaces containing no product 
vectors. Note that the (normalized) projector onto such a subspace must be 
entangled, as condition (a) of the theorem is not satisfied. This approach 
was successfully applied in [273] (see also [290, 305]) and, in connection with 
the seemingly completely different concept of unextendible product bases, pro- 
duced an elegant, and so far the most transparent, method of construction 
of PPT entangled states. 

To describe the construction,^12 one needs the following definition [273]:

Definition 5.1. A set of product orthogonal vectors in $\mathcal{H}_{AB}$ that

(a) has fewer elements than the dimension of the space

(b) is such that there does not exist any product vector orthogonal to all of
them

is called an unextendible product basis (UPB).



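Here we recall an example of such a basis in the $3 \otimes 3$ system:

$$|v_0\rangle = \frac{1}{\sqrt{2}} |0\rangle |0-1\rangle , \qquad |v_2\rangle = \frac{1}{\sqrt{2}} |2\rangle |1-2\rangle ,$$

$$|v_1\rangle = \frac{1}{\sqrt{2}} |0-1\rangle |2\rangle , \qquad |v_3\rangle = \frac{1}{\sqrt{2}} |1-2\rangle |0\rangle ,$$

$$|v_4\rangle = \frac{1}{3} |0+1+2\rangle |0+1+2\rangle , \qquad (5.25)$$

where $|0-1\rangle$ stands for $|0\rangle - |1\rangle$, and similarly for the other vectors.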



Of course, the above five vectors are orthogonal to each other. However, each
of the two subsets $\{|v_0\rangle, |v_1\rangle, |v_4\rangle\}$ and $\{|v_2\rangle, |v_3\rangle, |v_4\rangle\}$ spans the full
three-dimensional space. This prevents the existence of a sixth product vector
that would be orthogonal to all five of them. How do we connect this with the
problem we are dealing with in this section? The answer is: via the subspace
complementary to the one spanned by these vectors. Indeed, suppose that
$\{w_i = \phi_i \otimes \psi_i\}_{i=1}^k$ is a UPB. For a $d \otimes d$ system, consider the projector
$P = \sum_{i=1}^k |w_i\rangle\langle w_i|$ onto the subspace $\mathcal{H}$ spanned by the vectors $w_i$ ($\dim \mathcal{H} = k$).
Now, consider the state uniformly distributed on its orthogonal complement
$\mathcal{H}^\perp$ ($\dim \mathcal{H}^\perp = d^2 - k$),

$$\varrho = \frac{1}{d^2 - k} (I - P) . \qquad (5.26)$$

The range of the state ($\mathcal{H}^\perp$) contains no product vectors: otherwise one would
be able to extend the product basis $\{w_i\}$. Then, by Theorem 5.3, the state
must be entangled. Let us now calculate $\varrho^{T_B}$. Since $w_i = \phi_i \otimes \psi_i$, then
$(|w_i\rangle\langle w_i|)^{T_B} = |\tilde{w}_i\rangle\langle \tilde{w}_i|$, where $\tilde{w}_i = \phi_i \otimes \psi_i^*$. The vectors $\tilde{w}_i$ are orthogonal
to each other, so that the operator $P^{T_B} = \sum_i |\tilde{w}_i\rangle\langle \tilde{w}_i|$ is a projector.
Consequently, the operator $(I - P)^{T_B} = I - P^{T_B}$ is also a projector, and hence it
is positive. We conclude that $\varrho$ is PPT.

^12 The construction applies to the multipartite case [273]; in the present
review we consider only bipartite systems.
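The UPB construction can be reproduced numerically for the five vectors
(5.25). The sketch below is an illustration under our own naming (`upb`, `W`,
`gram`); it checks orthonormality, builds the state (5.26) and confirms that its
partial transpose has no negative eigenvalue. It does not verify that the range
contains no product vector, which is the analytical part of the argument.

```python
import numpy as np

e0, e1, e2 = np.eye(3)          # computational basis of C^3
s = 1 / np.sqrt(2)

# The five product vectors of (5.25); np.kron orders them as |a>|b>.
upb = [
    np.kron(e0, s * (e0 - e1)),                 # v0
    np.kron(s * (e0 - e1), e2),                 # v1
    np.kron(e2, s * (e1 - e2)),                 # v2
    np.kron(s * (e1 - e2), e0),                 # v3
    np.kron(e0 + e1 + e2, e0 + e1 + e2) / 3,    # v4
]
W = np.array(upb)               # rows are the UPB vectors
gram = W @ W.T                  # identity if the vectors are orthonormal

P = W.T @ W                     # projector onto the span of the UPB
rho = (np.eye(9) - P) / (9 - len(upb))          # state (5.26), d^2 - k = 4

pt = rho.reshape(3, 3, 3, 3).transpose(0, 3, 2, 1).reshape(9, 9)
print(np.allclose(gram, np.eye(5)))
print(np.linalg.matrix_rank(rho))               # dimension of the range
print(np.linalg.eigvalsh(pt).min() >= -1e-12)   # PPT check
```

Because all five vectors are real, complex conjugation acts trivially here and
the partial transpose coincides with the state itself.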

A different way of obtaining examples of PPT entangled states can be 
inferred from the papers devoted to the search for nondecomposable positive 
maps in the mathematical literature [292, 304] (see Sect. 5.2.3). A way to 
find a nondecomposable map is the following. One constructs some map $\Lambda$
and proves somehow that it is positive. Then one can guess some (possibly
unnormalized) state $\varrho$ that is PPT. Now, if $(I \otimes \Lambda)\varrho$ is not positive, then
$\Lambda$ cannot be decomposable, as shown in the discussion preceding Theorem
5.2. At the same time, the state must be entangled. In Sect. 5.2.4 we present 
an example of a PPT entangled state (based on [288]) found in this way. 
Thanks to its symmetric form, the state allowed researchers to reveal the 
first quantum effect produced by bound entanglement (see Sect. 24). 

Thus a possible direction for exploring the “PPT region” of entanglement 
is to develop the description of nondecomposable maps. However, it appears 
that there can be also a “back-reaction” : exploration of the PPT region may 
allow us to obtain new results on nondecomposable maps. It turns out that 
the UPB method described above allows for easy construction of new nonde- 
composable maps [308]. We direct the interested reader to the original article, 
as well as [290]. We note only that to find a nondecomposable map, one needs 
only to construct some UPB. Then the procedure is automatic, like the pro- 
cedure described above. To our knowledge, this is the first systematic way of 
finding nondecomposable maps. 

5.2.4 Examples 

We present here a couple of examples, illustrating the results contained in 
previous sections. In particular, we introduce two families of states that play 
important roles in the problem of distillation of entanglement. 

Reduction Criterion for Separability. As mentioned in Sect. 8, if $\Lambda$ is a
positive map, then for separable states we have

$$(I \otimes \Lambda)\varrho \geq 0 . \qquad (5.27)$$

If the map is not CP, then this condition is not trivial, i.e. for some states
$(I \otimes \Lambda)\varrho$ is not positive. Consider the map given by $\Lambda(A) = (\mathrm{Tr}\, A) I - A$.
The eigenvalues of the resulting operator $\Lambda(A)$ are given by $\lambda_i = \mathrm{Tr}\, A - a_i$,
where $a_i$ are the eigenvalues of $A$. If $A$ is positive, then $a_i \geq 0$. Now, since
$\mathrm{Tr}\, A = \sum_j a_j$, then $\lambda_i$ are also nonnegative. Thus the map is positive. Now,
the formula (5.27) and the dual formula $(\Lambda \otimes I)\varrho \geq 0$ applied to this particular
map imply that separable states must satisfy the following inequalities:

$$\varrho_A \otimes I - \varrho \geq 0 , \qquad I \otimes \varrho_B - \varrho \geq 0 . \qquad (5.28)$$

The two conditions, taken jointly, are called the reduction criterion [281, 309].
One can check that it implies the entropic inequalities (hence it is better in
“detecting” entanglement). From the reduction criterion, it follows that states
$\varrho$ of a $d \otimes d$ system with $\mathcal{F}(\varrho) > 1/d$ must be entangled (this was originally
argued in [66]). Indeed, from the above inequalities, it follows that for a
separable state $\sigma$ and a maximally entangled state $\psi_{me}$ one has
$\langle \psi_{me}| \sigma_A \otimes I - \sigma |\psi_{me}\rangle \geq 0$. Since the reduced density matrix $\varrho_A^{\psi_{me}}$ of the
state $\psi_{me}$ is proportional to the identity, we obtain
$\langle \psi_{me}| \sigma_A \otimes I |\psi_{me}\rangle = \mathrm{Tr}(\varrho_A^{\psi_{me}} \sigma_A) = 1/d$. Hence we obtain $\mathcal{F} \leq 1/d$. We
conclude that the latter condition is a separability criterion.

Let us note finally [281] that for $2 \otimes 2$ and $2 \otimes 3$ systems the reduction
criterion is equivalent to the PPT criterion, and hence is equivalent to
separability.
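To see the reduction criterion at work, the sketch below evaluates both
inequalities of (5.28) for a one-parameter test family, the mixture
$p P_+ + (1-p) I/d^2$ of a maximally entangled projector with white noise. The
choice of this family, of $d = 3$, and the function names are our assumptions for
illustration; the overlap with $P_+$ is used as a lower bound on the fully
entangled fraction.

```python
import numpy as np

def noisy_P_plus(p, d):
    """p P_+ + (1 - p) I/d^2 on a d (x) d system."""
    psi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return p * np.outer(psi, psi) + (1 - p) * np.eye(d * d) / d**2

def satisfies_reduction(rho, d, tol=1e-12):
    """Check rho_A (x) I - rho >= 0 and I (x) rho_B - rho >= 0."""
    r = rho.reshape(d, d, d, d)                 # indices (m, mu, n, nu)
    rho_A = np.trace(r, axis1=1, axis2=3)       # trace over the second system
    rho_B = np.trace(r, axis1=0, axis2=2)       # trace over the first system
    op1 = np.kron(rho_A, np.eye(d)) - rho
    op2 = np.kron(np.eye(d), rho_B) - rho
    return min(np.linalg.eigvalsh(op1).min(),
               np.linalg.eigvalsh(op2).min()) >= -tol

def overlap_with_P_plus(rho, d):
    psi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return float(psi @ rho @ psi)

d = 3
for p in (0.2, 0.3):
    rho = noisy_P_plus(p, d)
    print(p, overlap_with_P_plus(rho, d) > 1 / d, satisfies_reduction(rho, d))
```

For this family the criterion is violated exactly when the overlap with $P_+$
exceeds $1/d$, in line with the argument given above.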

Strong Separability Criteria from an Entanglement Witness. Consider
the unitary flip operator $V$ on a $d \otimes d$ system defined by $V \psi \otimes \phi = \phi \otimes \psi$.
Note that it can be written as $V = P_S - P_A$, where $P_S$ and $P_A$ are projectors
onto the symmetric and antisymmetric subspaces, respectively, of the total
space. Hence $V$ is a dichotomic observable (with eigenvalues $\pm 1$). One can
check that $\mathrm{Tr}\, V A \otimes B = \mathrm{Tr}\, AB$ for any operators $A$, $B$. Then $V$ is an
entanglement witness, so that $\mathrm{Tr}\, \varrho V \geq 0$ is a separability criterion [263]. Now,
let us find the corresponding positive map via the formula (5.23). One easily
finds that it is a transposition (up to an irrelevant factor). Remarkably, in
this way, given an entanglement witness, one can find the corresponding map,
so as to obtain the much stronger criterion given by (5.27).
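The properties of the flip operator quoted above are easy to confirm
numerically. In the following sketch (our own illustration, with $d = 3$, random
test matrices and a fixed seed as assumptions) we check the trace identity, the
relation between the flip operator and the partially transposed projector $P_+$,
and the sign of the witness on a product vector and on an antisymmetric vector.

```python
import numpy as np

d = 3
rng = np.random.default_rng(1)

# Flip operator: V |i>|j> = |j>|i>.
V = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        V[j * d + i, i * d + j] = 1.0

# Tr(V A (x) B) = Tr(AB) for arbitrary matrices A, B.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
lhs = np.trace(V @ np.kron(A, B))
rhs = np.trace(A @ B)

# d (I (x) T) P_+ reproduces V, so the map associated with V is transposition.
psi = np.eye(d).reshape(d * d) / np.sqrt(d)
P_plus = np.outer(psi, psi)
pt = P_plus.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)

# Witness values: nonnegative on a product vector, -1 on an antisymmetric one.
a = rng.normal(size=d); a /= np.linalg.norm(a)
b = rng.normal(size=d); b /= np.linalg.norm(b)
prod = np.kron(a, b)
anti = np.zeros(d * d); anti[1] = 1 / np.sqrt(2); anti[d] = -1 / np.sqrt(2)
print(np.isclose(lhs, rhs), np.allclose(d * pt, V))
print(prod @ V @ prod, anti @ V @ anti)
```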

Werner States. In [263] Werner considered states that do not change if
both subsystems are subjected to the same unitary transformation:

$$\varrho = U \otimes U \, \varrho \, U^\dagger \otimes U^\dagger \quad \text{for any unitary } U . \qquad (5.29)$$

He showed that such states (called Werner states) must be of the following
form:

$$\varrho_W(d) = \frac{1}{d^2 + \beta d} (I + \beta V) , \qquad -1 \leq \beta \leq 1 , \qquad (5.30)$$

where $V$ is the flip operator defined above. Another form for $\varrho_W$ is [310]

$$\varrho_W(d) = p \frac{P_A}{N_A} + (1-p) \frac{P_S}{N_S} , \qquad 0 \leq p \leq 1 , \qquad (5.31)$$

where $N_A = (d^2 - d)/2$ and $N_S = (d^2 + d)/2$ are the dimensions of the
antisymmetric and symmetric subspaces, respectively. It was shown [263]
that $\varrho_W$ is entangled if and only if $\mathrm{Tr}\, V \varrho_W < 0$. Equivalent conditions are
$\beta < -1/d$, $p > 1/2$ or $\varrho_W$ is NPT. Thus $\varrho_W$ is separable if and only if it is PPT.
For $d = 2$ (the two-qubit case) the state can be written as (see [264])

$$\varrho_W(2) = p |\psi_-\rangle\langle\psi_-| + (1-p) \frac{I}{4} , \qquad -\frac{1}{3} \leq p \leq 1 . \qquad (5.32)$$

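Note that any state $\varrho$, if subjected to a random transformation of the
form $U \otimes U$ (we call such an operation $U \otimes U$ twirling), becomes a Werner
state:

$$\int dU \, U \otimes U \, \varrho \, U^\dagger \otimes U^\dagger = \varrho_W . \qquad (5.33)$$

Moreover, $\mathrm{Tr}\, \varrho V = \mathrm{Tr}\, \varrho_W V$ (i.e. $\mathrm{Tr}\, \varrho V$ is invariant under $U \otimes U$ twirling).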
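The equivalence of the three entanglement conditions for Werner states can be
illustrated numerically. The sketch below assumes the parametrization
$(I + \beta V)/(d^2 + \beta d)$, takes $d = 3$, and draws one unitary from a QR
decomposition to test the $U \otimes U$ invariance; the function names and the
sampled values of $\beta$ are ours.

```python
import numpy as np

def flip(d):
    V = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            V[j * d + i, i * d + j] = 1.0
    return V

def werner(beta, d):
    return (np.eye(d * d) + beta * flip(d)) / (d**2 + beta * d)

def min_pt_eigenvalue(rho, d):
    pt = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    return np.linalg.eigvalsh(pt).min()

d = 3
rng = np.random.default_rng(2)
# The Q factor of a complex Gaussian matrix is unitary (any unitary will do).
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
UU = np.kron(Q, Q)

for beta in (-0.8, -0.2, 0.5):
    rho = werner(beta, d)
    invariant = np.allclose(UU @ rho @ UU.conj().T, rho)
    witness = np.trace(rho @ flip(d))
    print(beta, invariant, witness < 0, min_pt_eigenvalue(rho, d) < -1e-12)
```

For each sampled $\beta$ the sign of $\mathrm{Tr}\, \varrho_W V$ and the PPT test agree, and both
change at $\beta = -1/d$.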

Isotropic State. If we apply a local unitary transformation to the state
(5.32), changing $\psi_-$ into $\psi_+$, we can generalize its form to higher dimensions
as follows [281]:

$$\varrho(p, d) = p P_+ + (1-p) \frac{I}{d^2} , \qquad \text{where } -\frac{1}{d^2 - 1} \leq p \leq 1 . \qquad (5.34)$$

The state will be called “isotropic” [311] here.^13 For $p > 0$ it is interpreted as
a mixture of a maximally entangled state $P_+$ with a completely chaotic noise
represented by $I/d^2$. It was shown that it is the only state invariant under
$U \otimes U^*$ transformations.^14 If we use the singlet fraction $F = \mathrm{Tr}\, \varrho P_+$ as a
parameter, we obtain

$$\varrho(F, d) = \frac{d^2}{d^2 - 1} \left( (1-F) \frac{I}{d^2} + \left( F - \frac{1}{d^2} \right) P_+ \right) , \qquad 0 \leq F \leq 1 . \qquad (5.35)$$

The two parameters are related via $p = (d^2 F - 1)/(d^2 - 1)$. The state is
entangled if and only if $F > 1/d$ or, equivalently, if it is NPT. Similarly to
the case for Werner states, a state subjected to $U \otimes U^*$ twirling (random
$U \otimes U^*$ operations) becomes isotropic, and the parameter $F(\varrho)$ is invariant
under this operation.

^13 In [281] it was called a “noisy singlet”.

^14 The star denotes complex conjugation.

A Two-Qubit State. Consider the following two-qubit state:

$$\varrho = p |\psi_-\rangle\langle\psi_-| + (1-p) |00\rangle\langle 00| . \qquad (5.36)$$

From (5.9) we obtain the result that for $p \leq 1/\sqrt{2}$, the CHSH-Bell
inequalities are satisfied. A little bit stronger is the criterion involving the fully
entangled fraction; we have $\mathcal{F} \leq 1/2$ for $p \leq 1/2$. The entropic inequalities,
apart from the one involving $S_0$, are equivalent to each other for this state
and give again $p \leq 1/2$. Thus they reveal entanglement for $p > 1/2$. By
applying the partial transposition one can convince oneself that the state is
entangled for all $p > 0$ (for $p = 0$ it is manifestly separable).
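The last claim can be checked directly. In the sketch below (an illustration
with our own function names) the smallest eigenvalue of the partial transpose
of (5.36) is compared with the closed form $\left[(1-p) - \sqrt{(1-p)^2 + p^2}\right]/2$, which
follows from the $2 \times 2$ block spanned by $|00\rangle$ and $|11\rangle$ and is negative for
every $p > 0$.

```python
import numpy as np

def state(p):
    """The two-qubit state (5.36) in the basis |00>, |01>, |10>, |11>."""
    singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
    zero_zero = np.array([1.0, 0.0, 0.0, 0.0])
    return p * np.outer(singlet, singlet) + (1 - p) * np.outer(zero_zero, zero_zero)

def min_pt_eigenvalue(rho):
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return np.linalg.eigvalsh(pt).min()

def closed_form(p):
    return ((1 - p) - np.sqrt((1 - p) ** 2 + p ** 2)) / 2

for p in (0.0, 0.1, 0.5, 0.9):
    print(p, min_pt_eigenvalue(state(p)), closed_form(p))
```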

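Entangled PPT State via Nondecomposable Positive Map. Consider
the following state (constructed on the basis of Størmer matrices [288])
of a $3 \otimes 3$ system:

$$\sigma_\alpha = \frac{2}{7} |\psi_+\rangle\langle\psi_+| + \frac{\alpha}{7} \sigma_+ + \frac{5-\alpha}{7} \sigma_- , \qquad 2 \leq \alpha \leq 5 , \qquad (5.37)$$

where

$$\sigma_+ = \frac{1}{3} \big( |0\rangle|1\rangle\langle 0|\langle 1| + |1\rangle|2\rangle\langle 1|\langle 2| + |2\rangle|0\rangle\langle 2|\langle 0| \big) ,$$

$$\sigma_- = \frac{1}{3} \big( |1\rangle|0\rangle\langle 1|\langle 0| + |2\rangle|1\rangle\langle 2|\langle 1| + |0\rangle|2\rangle\langle 0|\langle 2| \big) . \qquad (5.38)$$

Using the formulas (5.15), one easily finds that for $\alpha \leq 4$ the state is PPT.
Consider now the following map [287]:

$$\Lambda \left( \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \right) = \begin{bmatrix} a_{11} & -a_{12} & -a_{13} \\ -a_{21} & a_{22} & -a_{23} \\ -a_{31} & -a_{32} & a_{33} \end{bmatrix} + \begin{bmatrix} a_{33} & 0 & 0 \\ 0 & a_{11} & 0 \\ 0 & 0 & a_{22} \end{bmatrix} . \qquad (5.39)$$

This map has been shown to be positive [287]. Now, one can calculate the
operator $(I \otimes \Lambda)\sigma_\alpha$ and find that one of its eigenvalues is negative for $\alpha > 3$
(explicitly, $\lambda_- = (3 - \alpha)/2$). This implies that

• the state is entangled (for a separable state we would have $(I \otimes \Lambda)\sigma_\alpha \geq 0$)

• the map is nondecomposable (for a decomposable map and PPT state
we also would have $(I \otimes \Lambda)\sigma_\alpha \geq 0$).

For $2 \leq \alpha \leq 3$ it is separable, as it can be written as a mixture of other
separable states $\sigma_\alpha = \frac{6}{7} \varrho_1 + \frac{\alpha - 2}{7} \sigma_+ + \frac{3 - \alpha}{7} \sigma_-$, where
$\varrho_1 = (|\psi_+\rangle\langle\psi_+| + \sigma_+ + \sigma_-)/3$. The latter state can be written as an integral over
product states:

$$\varrho_1 = \int_0^{2\pi} |\psi(\theta)\rangle\langle\psi(\theta)| \otimes |\psi(-\theta)\rangle\langle\psi(-\theta)| \, \frac{d\theta}{2\pi} ,$$

where $|\psi(\theta)\rangle = \frac{1}{\sqrt{3}} \left( |0\rangle + e^{i\theta}|1\rangle + e^{-2i\theta}|2\rangle \right)$ (there exists also a finite
decomposition exploiting phases of roots of unity [312]).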
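The behaviour of this $3 \otimes 3$ family can be checked numerically. The sketch
below (our illustration; the function names and the sampled values of $\alpha$ are
assumptions) builds the state, applies the map to the second subsystem block by
block, and compares the PPT test with the sign of the smallest eigenvalue of the
mapped operator. Only that sign is used; its magnitude depends on the
normalization conventions adopted for the state and the map.

```python
import numpy as np

def sigma_alpha(alpha):
    """(2/7) P_+ + (alpha/7) sigma_+ + ((5 - alpha)/7) sigma_- on C^3 (x) C^3."""
    psi = np.eye(3).reshape(9) / np.sqrt(3)
    P_plus = np.outer(psi, psi)
    s_plus = np.zeros((9, 9))
    s_minus = np.zeros((9, 9))
    for i in range(3):
        k = (i + 1) % 3
        s_plus[3 * i + k, 3 * i + k] = 1 / 3      # |i>|i+1>
        s_minus[3 * k + i, 3 * k + i] = 1 / 3     # |i+1>|i>
    return (2 * P_plus + alpha * s_plus + (5 - alpha) * s_minus) / 7

def positive_map(a):
    """Flip the sign of the off-diagonal part and add diag(a33, a11, a22)."""
    signs = 2 * np.eye(3) - 1                     # +1 on the diagonal, -1 elsewhere
    return signs * a + np.diag([a[2, 2], a[0, 0], a[1, 1]])

def apply_on_second(rho, local_map, d=3):
    r = rho.reshape(d, d, d, d)                   # r[m, :, n, :] is the block A_mn
    out = np.zeros_like(r)
    for m in range(d):
        for n in range(d):
            out[m, :, n, :] = local_map(r[m, :, n, :])
    return out.reshape(d * d, d * d)

def min_pt_eigenvalue(rho, d=3):
    pt = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    return np.linalg.eigvalsh(pt).min()

for alpha in (2.5, 3.5, 4.5):
    rho = sigma_alpha(alpha)
    mapped_min = np.linalg.eigvalsh(apply_on_second(rho, positive_map)).min()
    print(alpha, min_pt_eigenvalue(rho) >= -1e-12, mapped_min < -1e-12)
```

For $\alpha = 3.5$ the state passes the PPT test while the mapped operator has a
negative eigenvalue, which is the situation described in the two items above.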
5.2.5 Volumes of Entangled and Separable States



The question of the volume of the set of separable or entangled states in 
the set of all states, raised in [297], is important for several reasons. First, 
one might be interested in the following basic question: is the world more 
classical or more quantum? Second, the size of the volume would reflect a
consideration that is important for the numerical analysis of entanglement,
namely to what extent separable or entangled states are typical. Later it
became apparent that considerations of the volume of separable states led to 
important results concerning the question of the relevance of entanglement 
in quantum computing [313]. 

We shall mainly consider a qualitative question: is the volume of separable
($V_S$), entangled ($V_E$) or PPT entangled ($V_{PE}$) states nonzero? All these
problems can be solved by the same method [297]: one picks a suitable state
from either of the sets and tries to show that some (perhaps small) ball around
the state is still contained in the set.

For separable states one takes the ball around the maximally mixed state:
one needs a number $p_0$ such that for any state $\tilde{\varrho}$ the state

$$\varrho = (1-p) \frac{I}{N} + p \tilde{\varrho} \qquad (5.40)$$

is separable for all $p \leq p_0$ (here $N$ is the dimension of the total system).
In [297] it was shown that, in the general case of multipartite systems of
any finite dimension, such a $p_0$ exists. Note that in fact we have obtained
a sufficient condition for separability: if the eigenvalues of a given state do
not differ too much from the uniform spectrum of the maximally mixed state,
then the state must be separable. One would like to have some concrete values
of $p_0$ that satisfy the condition (the larger $p_0$ is, the stronger the condition).

Consider, for example, the $2 \otimes 2$ system. Here one can provide the largest
possible $p_0$, as there exists a necessary and sufficient condition for separability
(the PPT criterion). Consider the eigenvalues $\lambda_i$ of the partial transposition
of the state (5.40). They are of the form $\lambda_i = (1-p)/N + p \tilde{\lambda}_i$, where $\tilde{\lambda}_i$
are the eigenvalues of $\tilde{\varrho}^{T_B}$ (in our case $N = 4$). One easily can see (on the
basis of the Schmidt decomposition) that a partial transposition of a pure
state cannot have eigenvalues smaller than $-1/2$. Hence the same is true for
mixed states. In conclusion, we obtain the result that if $(1-p)/N - p/2 \geq 0$
then the eigenvalues $\lambda_i$ are nonnegative for arbitrary $\tilde{\varrho}$. Thus for the $2 \otimes 2$
system one can take $p_0 = 1/3$ to obtain a sufficient condition for separability.
Concrete values of $p_0$ for the case of $n$-partite systems of dimension $d$ were
obtained in [270]:

$$p_0 = \frac{1}{(1 + 2/d)^{n-1}} . \qquad (5.41)$$
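The two-qubit value $p_0 = 1/3$ derived above can be probed numerically. The
sketch below is only a sanity check under our own assumptions (a state of the
form $(1-p) I/4 + p \tilde{\varrho}$, random pure states $\tilde{\varrho}$ drawn with a fixed seed, and
the singlet as the extremal case); mixed $\tilde{\varrho}$ then follow by convexity.

```python
import numpy as np

def min_pt_eigenvalue(rho):
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return np.linalg.eigvalsh(pt).min()

def admix(rho_tilde, p):
    """(1 - p) I/4 + p rho_tilde for a two-qubit state rho_tilde."""
    return (1 - p) * np.eye(4) / 4 + p * rho_tilde

rng = np.random.default_rng(3)
worst = np.inf
for _ in range(2000):
    v = rng.normal(size=4) + 1j * rng.normal(size=4)
    v /= np.linalg.norm(v)
    worst = min(worst, min_pt_eigenvalue(admix(np.outer(v, v.conj()), 1 / 3)))

singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
rho_singlet = np.outer(singlet, singlet)

print(worst >= -1e-12)                                  # PPT at p = 1/3
print(min_pt_eigenvalue(admix(rho_singlet, 0.34)) < 0)  # NPT just above 1/3
```

The singlet saturates the bound: its partial transpose has the eigenvalue
$-1/2$, so the admixed state stops being PPT as soon as $p$ exceeds $1/3$.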



These considerations proved to be crucial for the analysis of the experimental 
implementation of quantum algorithms in high-temperature systems via nuclear
magnetic resonance (NMR) methods. This is because the generic state
used in this approach is the maximally mixed state with a small admixture of
some pure entangled state. In [313] the sufficient conditions of the above kind 
were further developed, and it was concluded that in all of the NMR quantum 
computing experiments performed so far, the admixture of the pure state was 
too small. Thus the total state used in these experiments was separable: it 
satisfied a condition sufficient for separability. This raised an interesting dis- 
cussion as to what extent entanglement is necessary for quantum computing 
[314, 315] (see also [316]). Even though there is still no general answer, it was 
shown [314] that the Shor algorithm [9] requires entanglement. 

Let us now turn back to the question of the volumes of $V_E$ and $V_{PE}$. If one
takes $\psi_+$ of a $d \otimes d$ system for simplicity, it is easy to see that a not very
large admixture of any state will ensure $\mathcal{F} > 1/d$. Thus any state belonging
to the neighborhood must be entangled. Showing that the volume of PPT
entangled states is nonzero is a bit more involved [297].

In conclusion, all three types of states are not atypical in the set of all
states of a given system. However, it appears that the ratio of the volume of
the set of PPT states $V_{PPT}$ (and hence also separable states) to the volume
of the total set of states goes down exponentially with the dimension of the
system (see Fig. 5.2). This result was obtained numerically [297]^15 and still
awaits analytical proof. However, it is compatible with the rigorous result
in [318] that in the infinite-dimensional case the set of separable states is
nowhere dense in the total set of states. Then the generic infinite-dimensional
state is entangled.

Fig. 5.2. The ratio $P_N = V_{PPT}/V$ of the volume of PPT states to the volume of
the set of all states versus the dimension $N$ of the total system. Different symbols
distinguish different sizes of one subsystem ($k = 2$ (○), $k = 3$ (△)). (This figure is
reproduced from Phys. Rev. A 58, 883 (1998) by permission of the authors)



5.3 Mixed-State Entanglement as a Resource
for Quantum Communication

As one knows, if two distant observers (one usually calls them Alice and
Bob) share a pair of particles in a singlet state $\psi_-$ then they can send a
quantum state to one another by the use of only two additional classical bits.
This is called quantum teleportation [261].^16 If the classical communication
is free of charge (since it is much cheaper than communication of quantum 
bits), one can say that a singlet pair is a resource equivalent to sending one 
qubit. In the following, it will be shown that mixed-state entanglement can 
also be a resource for quantum communication. Quantum communication 
via mixed entangled states will require, apart from teleportation, an action 
called distillation. It will be also shown that there exists a peculiar type of 
entanglement (bound entanglement) that is a surprisingly weak resource. 

5.3.1 Distillation of Entanglement: 

Counterfactual Error Correction 

Now we will attempt to describe the ingenious concept of distillation of en- 
tanglement introduced in [266] and developed in [66, 319] (see also [320]). 
To this end, let us first briefly describe the idea of classical and quantum 
communication via a noisy channel. As is known [321], the central idea of 
classical information theory, pioneered by Shannon, is that one can send in- 
formation reliably and with a nonzero rate via a noisy information channel. 
This is achieved by coding: the k input bits of information are encoded into a 
larger number of n bits. Such a package is sent down the noisy channel. Then 
the receiver performs a decoding transformation, recovering the k input bits 
with asymptotically (in the limit of large n and k) perfect fidelity. Moreover, 
the asymptotic rate of information transmission k/n is nonzero. 

Footnote: The result could depend on the measure of the volume chosen [317]. In [298] two
different measures were compared and produced similar results.

Footnote: Quantum teleportation is called “entanglement-assisted teleportation” in Chap. 2 of this book.

In the quantum domain, one would like to communicate quantum states instead
of classical messages. It appears that an analogous scheme can be applied
here [62, 322]. The k input qubits of quantum information are supplemented
with additional qubits in some standard initial state, and the total
system of n qubits is subjected to some quantum transformation. Now the
package can be sent down the channel. After the decoding operation, the
state of k qubits is recovered with asymptotically perfect fidelity [69] (now it
is quantum fidelity - characterizing how close the output state is to the input 
state), regardless of the particular form of the state. The discovery of the 
above possibility (called quantum error correction; we shall call it here direct 
error correction) initiated, in particular, extensive studies of quantum error- 
correcting codes (see [255] and references therein), as well as studies of the 
capacities of quantum channels (see [94] and references therein). A common
example of a quantum channel is the one-qubit quantum depolarizing channel:
here an input state is undisturbed with probability p and subjected to
a random unitary transformation with probability 1 − p. It can be described
by the following completely positive map:

$$\Lambda(\varrho) = p\,\varrho + (1-p)\,\frac{I}{2} , \tag{5.42}$$

where $I/2$ is the maximally mixed state of one qubit. This channel has been
thoroughly investigated [66, 266, 323, 324]. What is important here is that it 
has been shown [324] that for p < 2/3 the above method of error correction 
does not work. In the classical domain, it would mean that the channel was 
useless. Here, surprisingly, there is a trick that allows one to beat this limit, 
even down to p = 1/3! The scheme that realizes this fact is quite mysterious. 
In direct error correction we deal directly with the systems carrying the infor- 
mation to be protected. Now, it appears that by using entanglement, one can 
remove the results of the action of noise without even having the information 
to be sent. Therefore, it can be called counterfactual error correction. 
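
As a numerical illustration (a minimal sketch, not part of the original argument), one can apply the channel (5.42) to one half of a maximally entangled pair and evaluate the overlap F of the output with $\psi_+ = (|00\rangle + |11\rangle)/\sqrt{2}$. The helper names `depolarize_bob` and `singlet_fraction` are chosen here for illustration only.

```python
import numpy as np

def depolarize_bob(rho_ab, p):
    """Apply the depolarizing channel (5.42) to Bob's qubit of a two-qubit state.

    By linearity, (I x Lambda)(rho_ab) = p * rho_ab + (1 - p) * rho_A x I/2.
    """
    rho_a = np.trace(rho_ab.reshape(2, 2, 2, 2), axis1=1, axis2=3)  # trace out Bob
    return p * rho_ab + (1 - p) * np.kron(rho_a, np.eye(2) / 2)

psi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
P_plus = np.outer(psi_plus, psi_plus)

def singlet_fraction(rho):
    """F = <psi_+| rho |psi_+>."""
    return float(psi_plus @ rho @ psi_plus)

for p in (0.2, 1 / 3, 0.5, 0.9):
    F = singlet_fraction(depolarize_bob(P_plus, p))
    print(f"p = {p:.3f}   F = {F:.4f}   (1 + 3p)/4 = {(1 + 3 * p) / 4:.4f}")
```

In this sketch the output has F = (1 + 3p)/4, so F exceeds 1/2 exactly when p > 1/3, which is the threshold quoted in the text.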

How does this work? The idea itself is not complicated. Instead of sending 
the qubits of information, Alice (the sender) sends Bob particles from entangled
pairs (in the state $\psi_-$), keeping one particle from each pair. The pairs
are disturbed by the action of the channel, so that their state turns into a
mixture that still possesses some residual entanglement. Now, it turns out
that by local quantum operations (including collective actions over all mem- 
bers of the pairs in each lab) and classical communication (local operations 
and classical communication, LOCC) between Alice and Bob, Alice and Bob 
are able to obtain a smaller number of pairs in a nearly maximally entan- 
gled state (see Fig. 5.3). Such a procedure, proposed in [266], is called 
distillation. As in the case of direct error correction, one can achieve a finite 
asymptotic rate k/n for the distilled pairs per input pair, and the fidelity, 
which now denotes the similarity of the distilled pairs to a product of sin- 
glet pairs, is asymptotically perfect. Now, the distilled pairs can be used for 
teleportation of quantum information. The maximal possible rate achievable
within the above framework is called the entanglement of distillation of the
state $\varrho$, and is denoted by $D(\varrho)$. Thus, if Alice and Bob share n pairs, each
in state $\varrho$, they can faithfully teleport $k = nD(\varrho)$ qubits.

Footnote: The capacity Q of a quantum channel is the greatest ratio k/n for reliable
transmission down the given channel.

Footnote: If the channel is memoryless, the mixture factorizes into states $\varrho$ of individual
pairs.

As we have mentioned, the error correction stage and the transmission 
stage are separated in time here; the error correction can be performed even 
before the information to be sent was produced. Using the terminology of
[325], one can say that Alice and Bob operate on potentialities (an entangled
pair represents a potential communication) and correct the potential error, 
so that when the actual information is coming, it can be teleported without 
any additional action. 

The above scheme is not only mysterious. It is also much more powerful 
than the direct method. In the next section we describe a distillation protocol 
that allows one to send quantum information reliably via a channel with 
p > 1/3. A general question is: where are the limits of distillation? As we 
have seen, the basic action refers to mixed bipartite states, so that instead of 
talking about channels, we can concentrate on bipartite states. The question 
can be formulated as follows: which states $\varrho$ can be distilled by the most
general LOCC actions? Here, by saying that a state $\varrho$ can be distilled, we
mean that Alice and Bob can obtain singlets from the initial state $\varrho^{\otimes n}$ of n
pairs (thus we shall work with memoryless channels).

One can easily see that separable states cannot be distilled: they contain 
no entanglement, so it is impossible to convert them into entangled states by 
LOCC operations. Then the final form of our question is: can all entangled 
states be distilled? Before the answer to this question was provided, the 
default was “yes”, and the problem was how to prove it. Now, we know that 
the answer is “no”, so that the structure of the entanglement of bipartite 
states is much more puzzling than one might have suspected. 

Finally, we should mention that for pure states the problem of conversion 
into singlet pairs has been solved. Here there is no surprise: all entangled pure 
states can be distilled [326] (see also [327]). What is especially important 
is that this distillation can be performed reversibly: from the singlet pairs 
obtained, we can recover (asymptotically) the same number of input pairs
[326]. As we shall see, this is not the case for mixed states.






Fig. 5.3. Distillation of mixed-state entanglement (diagram labels: n pairs, each in state $\varrho$; Alice and Bob operations; pairs)
5.3.2 Distillation of Two-Qubit States

In this section, we shall describe what was historically the first distillation 
protocol for two-qubit states, devised by Bennett, Brassard, Popescu, Schu- 
macher, Smolin and Wootters (BBPSSW) [266]. Then we shall show that a 
more general protocol can distill any entangled two-qubit state. 

BBPSSW Distillation Protocol. The BBPSSW distillation protocol still
remains the most transparent example of distillation. It works for two-qubit
states $\varrho$ with a fully entangled fraction satisfying $\mathcal{F} > 1/2$. Such states are
equivalent to those with $F > 1/2$, so that we can restrict ourselves to the
latter states. Hence we assume that Alice and Bob initially share a huge
number of pairs, each in the same state $\varrho$ with $F > 1/2$, so that the total
state is $\varrho^{\otimes n}$. Now they aim to obtain a smaller number of pairs with a higher
singlet fraction F. To this end they iterate the following steps:

1. They take two pairs and apply $U \otimes U^*$ twirling to each of them, i.e.
a random unitary transformation of the form $U \otimes U^*$ (Alice picks at
random a transformation U, applies it, and communicates to Bob which
transformation she has chosen; then he applies $U^*$ to his particle). Thus
one has a transformation from two copies of $\varrho$ to two copies of the isotropic
state $\varrho_F$ with an unchanged F:

$$\varrho \otimes \varrho \to \varrho_F \otimes \varrho_F . \tag{5.43}$$

2. Each party performs the unitary transformation XOR on his/her members
of the pairs (see Fig. 5.4). The transformation is given by

$$U_{\rm XOR} |a\rangle|b\rangle = |a\rangle|(a + b) \bmod 2\rangle \tag{5.44}$$

(the first qubit is called the source, the second qubit the target). They
obtain some complicated state $\tilde\varrho$ of two pairs.

3. The pair of target qubits is measured locally in the basis $|0\rangle$, $|1\rangle$ and it is
discarded. If the results agree (success), the source pair is kept and has a
greater singlet fraction. Otherwise (failure), the source pair is discarded
too.

If the results in step 3 agree, the final state $\varrho'$ of the source pair kept can be
calculated from the formula

$$\varrho' = \mathrm{Tr}_{\mathcal{H}_t} \left( P_t \otimes I_s \, \tilde\varrho \, P_t \otimes I_s \right) , \tag{5.45}$$

where the partial trace is performed over the Hilbert space $\mathcal{H}_t$ of the target
pair, $I_s$ is the identity on the space of the source pair (because it was
not measured), and $P_t = |00\rangle\langle 00| + |11\rangle\langle 11|$ acts on target pair space and
corresponds to the case “results agree”. The right-hand side of (5.45) is
understood up to normalization by the probability of success.

Fig. 5.4. Bilateral quantum XOR operation

Footnote: In this contribution we restrict ourselves to distillation by means of perfect
operations. The more realistic case where there are imperfections in the quantum
operations performed by Alice and Bob is considered in [328].

Footnote: The quantum XOR gate is the most common quantum two-qubit gate and was
introduced in [329].

Subsequently, one can calculate the singlet fraction of the surviving pair
as a function of the singlet fraction of the two initial pairs, obtaining

$$F'(F) = \frac{F^2 + (1/9)(1-F)^2}{F^2 + (2/3)F(1-F) + (5/9)(1-F)^2} . \tag{5.46}$$
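
The formula (5.46) can be checked numerically by carrying out steps 2 and 3 on explicit density matrices. The following is a minimal sketch under stated assumptions: both pairs are already in the isotropic form $F P_+ + \frac{1-F}{3}(I - P_+)$, the qubits are ordered as (source A, source B, target A, target B), which is the reverse of the tensor order written in (5.45), and all function names are illustrative.

```python
import itertools
import numpy as np

PSI_PLUS = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
P_PLUS = np.outer(PSI_PLUS, PSI_PLUS)

def isotropic(F):
    """Two-qubit isotropic state with singlet fraction F."""
    return F * P_PLUS + (1 - F) / 3 * (np.eye(4) - P_PLUS)

def bilateral_xor():
    """XOR applied by Alice (A1 -> A2) and by Bob (B1 -> B2); qubit order A1 B1 A2 B2."""
    U = np.zeros((16, 16))
    for a1, b1, a2, b2 in itertools.product((0, 1), repeat=4):
        src = 8 * a1 + 4 * b1 + 2 * a2 + b2
        dst = 8 * a1 + 4 * b1 + 2 * (a2 ^ a1) + (b2 ^ b1)
        U[dst, src] = 1.0
    return U

def bbpssw_round(F):
    """Return (new singlet fraction, success probability) for one round."""
    rho = np.kron(isotropic(F), isotropic(F))                  # source pair (x) target pair
    U = bilateral_xor()
    rho = U @ rho @ U.T
    P_t = np.kron(np.eye(4), np.diag([1.0, 0.0, 0.0, 1.0]))    # target results agree
    rho = P_t @ rho @ P_t
    p_success = np.trace(rho)
    rho_src = np.trace(rho.reshape(4, 4, 4, 4), axis1=1, axis2=3) / p_success
    return float(PSI_PLUS @ rho_src @ PSI_PLUS), float(p_success)

def formula_5_46(F):
    return (F**2 + (1 - F)**2 / 9) / (F**2 + 2 * F * (1 - F) / 3 + 5 * (1 - F)**2 / 9)

for F in (0.6, 0.75, 0.9):
    F_new, p = bbpssw_round(F)
    print(f"F = {F:.2f}  simulated F' = {F_new:.6f}  formula = {formula_5_46(F):.6f}  p = {p:.6f}")
```

In this sketch the success probability of a round coincides with the denominator of (5.46).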



Since the function $F'(F)$ is continuous, $F'(F) > F$ for $F > 1/2$ and $F'(1) = 1$,
we obtain the result that by iterating the procedure, Alice and Bob can
obtain a state with arbitrarily high F. Of course, the larger F is required
to be, the more pairs must be sacrificed, and the lower the probability p of
success is. Thus if Alice and Bob start with some $F_{\rm in}$ and would like to
end up with some higher $F_{\rm out}$, the number of final pairs will be on average
$k = np/2^l$, where l and p depend on $F_{\rm in}$ and $F_{\rm out}$, and denote the number
of iterations of the function $F'(F)$ required to reach $F_{\rm out}$ starting from $F_{\rm in}$,
and the probability of a string of l successful operations, respectively.
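
The bookkeeping of l, p and k can be sketched as follows. This is an illustration only: it assumes that the success probability of a single round equals the denominator of (5.46) and that p is the product of the single-round probabilities; the names `recurrence` and `rounds_needed` are hypothetical.

```python
def recurrence(F):
    """One step of (5.46): returns (new singlet fraction, success probability)."""
    success = F**2 + 2 * F * (1 - F) / 3 + 5 * (1 - F)**2 / 9
    return (F**2 + (1 - F)**2 / 9) / success, success

def rounds_needed(F_in, F_out, max_rounds=10_000):
    """Iterate (5.46) from F_in until F_out is reached; return (l, p, final F)."""
    if not 0.5 < F_in <= 1.0:
        raise ValueError("the recurrence only increases F for F_in > 1/2")
    F, l, p = F_in, 0, 1.0
    while F < F_out:
        if l == max_rounds:
            raise RuntimeError("F_out not reached within max_rounds")
        F, p_round = recurrence(F)
        p *= p_round          # probability of a string of successful rounds
        l += 1
    return l, p, F

n = 1_000_000
l, p, F = rounds_needed(0.75, 0.99)
print(f"l = {l}, p = {p:.3e}, final F = {F:.5f}, expected pairs k = {n * p / 2**l:.1f}")
```

The rapid decrease of $p/2^l$ with growing $F_{\rm out}$ is the reason why the yield of the recurrence alone tends to zero.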

The above method allows one to obtain an arbitrarily high F, but the
asymptotic rate is zero. However, if F is high enough so that 1 − S > 0, where
S is the von Neumann entropy of the state $\varrho$, then there exists a protocol
(called hashing) that gives a nonzero rate [66]. We shall not describe this
protocol here, but we note that for any state with F > 1/2 Alice and Bob
can start by using the recurrence method to obtain 1 − S > 0, and then
apply the hashing protocol. This gives a nonzero rate for any state with
F > 1/2. This means that quantum information can be transmitted via a
depolarizing channel (5.42) whenever p > 1/3. Indeed, one can check that if
Alice sends one of the particles from a pair in a state $\psi_+$ via the channel
to Bob, then the final state shared by them will be an isotropic one with
F > 1/2. By repeating this process, Alice and Bob can obtain many such
pairs. Then distillation will allow them to use the pairs for asymptotically
faithful quantum communication.
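
To see where the condition 1 − S > 0 starts to hold, one can evaluate the entropy of a two-qubit isotropic state as a function of F. The sketch below assumes the entropy is measured in bits and that the state has eigenvalues F and three times (1 − F)/3; the function names are illustrative.

```python
import numpy as np

def entropy_isotropic(F):
    """von Neumann entropy (in bits) of the two-qubit isotropic state with singlet fraction F."""
    eigs = np.array([F, (1 - F) / 3, (1 - F) / 3, (1 - F) / 3])
    eigs = eigs[eigs > 0]                      # 0 * log 0 is taken as 0
    return float(-np.sum(eigs * np.log2(eigs)))

def hashing_condition(F):
    """The quantity 1 - S that must be positive for the hashing protocol."""
    return 1.0 - entropy_isotropic(F)

def threshold(lo=0.5, hi=1.0, tol=1e-10):
    """Bisection for the F at which 1 - S changes sign (1 - S is increasing for F > 1/4)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hashing_condition(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

print(f"1 - S > 0 for F above approximately {threshold():.4f}")
```

Below this threshold the recurrence (5.46) has to be iterated first, as described in the text.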

All Entangled Two-Qubit States are Distillable. As was mentioned in
Sect. 5.2.4, there exist entangled two-qubit states with $\mathcal{F} < 1/2$, so that no
product unitary transformation can produce F > 1/2. Thus the BBPSSW
protocol cannot be applied to all entangled two-qubit states. We shall show
below that, nevertheless, all such states are distillable [303]. It was possible
to solve the problem mainly because of the characterization of the entangled
states as discussed in Sect. 5.2.3.

Since we are not interested in the value of the asymptotic rate, it suffices
to show that by starting with pairs in an entangled state, Alice and
Bob are able to obtain a fraction of them in a new state with F > 1/2 (and
then the BBPSSW protocol will do the job). Our main tool will be the so-called
filtering operation [326, 330], which involves generalized measurement
performed by one of the parties (say, Alice) on individual pairs. This
measurement consists of two outcomes {1, 2}, associated with operators $W_1$ and
$W_2$ satisfying

$$W_1^\dagger W_1 + W_2^\dagger W_2 = I_A \tag{5.47}$$

($I_A$ and $I_B$ denote identities on Alice’s and Bob’s systems, respectively).
After such a measurement, the state becomes

$$\varrho \to \frac{1}{p_i} W_i \otimes I_B \, \varrho \, W_i^\dagger \otimes I_B , \quad i = 1, 2 \tag{5.48}$$

with probability $p_i = \mathrm{Tr}(W_i \otimes I_B \, \varrho \, W_i^\dagger \otimes I_B)$. The condition (5.47) ensures $p_1 + p_2 = 1$.
Now Alice will be interested only in one outcome (say, 1). If this outcome
occurs, Alice and Bob keep the pair; otherwise they discard it (this requires
communication from Alice to Bob). Then we are only interested in the
operator $W_1$. If its norm does not exceed 1, one can always find a suitable
$W_2$ such that the condition (5.47) is satisfied. Now, since neither the form
of the final state (5.48) nor the fact whether $p_1$ is zero or not depends on
the positive factor multiplying $W_1$, we are free to consider completely
arbitrary filtering operators $W_1$. In conclusion, for any entangled state $\varrho$ we
must find an operator W such that the state resulting from the filtering,
$W \otimes I \, \varrho \, W^\dagger \otimes I / \mathrm{Tr}(W \otimes I \, \varrho \, W^\dagger \otimes I)$, has F > 1/2. Consider then an arbitrary
two-qubit entangled state $\varrho$. From Theorem 5.2, we know that $\varrho^{T_B}$ is not a
positive operator, and hence there exists a vector $\psi$ for which

$$\langle\psi|\varrho^{T_B}|\psi\rangle < 0 . \tag{5.49}$$

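Now let us note that any vector $\phi$ of a $d \otimes d$ system can be written as
$\phi = A_\phi \otimes I \, \psi_+$, where $A_\phi$ is some operator. Indeed, write $\phi$ in a product
basis: $\phi = \sum_{i,j=1}^{d} a_{ij} |i\rangle|j\rangle$. Then the matrix elements of the operator $A_\phi$
are given by $\langle i|A_\phi|j\rangle = \sqrt{d}\, a_{ij}$ (in our case, d = 2). Therefore the formula
(5.49) can be rewritten in the form

$$\mathrm{Tr}\left[ (A_\psi^\dagger \otimes I \, \varrho \, A_\psi \otimes I)^{T_B} P_+ \right] < 0 . \tag{5.50}$$

Using the identity $\mathrm{Tr}\, A^{T_B} B = \mathrm{Tr}\, A B^{T_B}$, valid for any operators A, B, and
the fact that $P_+^{T_B} = (1/d)V$ (where V is the flip operator, see Sect. 5.2.4), we
obtain

$$\mathrm{Tr}\left[ (A_\psi^\dagger \otimes I \, \varrho \, A_\psi \otimes I) V \right] < 0 . \tag{5.51}$$

We conclude that $A_\psi^\dagger \otimes I \, \varrho \, A_\psi \otimes I$ cannot be equal to the null operator, and
hence we can consider the following state:

$$\tilde\varrho = \frac{A_\psi^\dagger \otimes I \, \varrho \, A_\psi \otimes I}{\mathrm{Tr}(A_\psi^\dagger \otimes I \, \varrho \, A_\psi \otimes I)} .$$

Now it is clear that the role of the filter W will be played by the operator
$A_\psi^\dagger$. We shall show that $\langle\psi_-|\tilde\varrho|\psi_-\rangle > 1/2$, where $\psi_- = (|01\rangle - |10\rangle)/\sqrt{2}$. Then a
suitable unitary transformation by Alice can convert $\tilde\varrho$ into a state $\varrho'$ with
F > 1/2.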

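From the inequality (5.51), we obtain

$$\mathrm{Tr}\, \tilde\varrho V < 0 . \tag{5.52}$$

If we use the product basis $|1\rangle = |00\rangle$, $|2\rangle = |01\rangle$, $|3\rangle = |10\rangle$, $|4\rangle = |11\rangle$, the
inequality (5.52) can be written as

$$\tilde\varrho_{11} + \tilde\varrho_{44} + \tilde\varrho_{23} + \tilde\varrho_{32} < 0 . \tag{5.53}$$

The above inequality, together with the trace condition $\mathrm{Tr}\, \tilde\varrho = \sum_i \tilde\varrho_{ii} = 1$,
gives

$$\langle\psi_-|\tilde\varrho|\psi_-\rangle = \frac{1}{2}(\tilde\varrho_{22} + \tilde\varrho_{33} - \tilde\varrho_{23} - \tilde\varrho_{32}) > \frac{1}{2} . \tag{5.54}$$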



To summarize, given a large supply of pairs, each in an entangled state
$\varrho$, Alice and Bob can distill maximally entangled pairs in the following way.
First Alice applies a filtering determined by the operator $W = A_\psi^\dagger$ described
above. Then Alice and Bob obtain, on average, a supply of np surviving pairs
in the state $\tilde\varrho$ (here $p = \mathrm{Tr}(W \otimes I \, \varrho \, W^\dagger \otimes I)$ is the probability that the outcome
of Alice’s measurement will be the one associated with the operator W). Now
Alice applies an operation $i\sigma_y$ to obtain a state with F > 1/2. Then they
can use the BBPSSW protocol to distill maximally entangled pairs that are
useful for quantum communication. Note that we have assumed that Alice
and Bob know the initial state of the pairs. It can be shown that, if they do
not know it, they still can do the job (in the two-qubit case) by sacrificing
pairs to estimate the state (P. Horodecki, unpublished).

The above protocol can easily be shown to work in the $2 \otimes 3$ case. The
protocol can also be fruitfully applied for the system $2 \otimes n$ if the state is NPT
[331].
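
Two ingredients of the construction above lend themselves to a quick numerical check: the correspondence $\phi = A_\phi \otimes I \, \psi_+$ with $\langle i|A_\phi|j\rangle = \sqrt{d}\, a_{ij}$, and the completion of a filter to a two-outcome measurement satisfying (5.47). The following is a minimal sketch; the function names are illustrative, and the choice $W_2 = \sqrt{I - W_1^\dagger W_1}$ is only one of many possible completions.

```python
import numpy as np

def psi_plus(d):
    """Maximally entangled vector (1/sqrt(d)) sum_i |i>|i>."""
    return np.eye(d).reshape(d * d) / np.sqrt(d)

def operator_from_vector(phi, d):
    """Operator A with <i|A|j> = sqrt(d) a_ij, where phi = sum_ij a_ij |i>|j>."""
    return np.sqrt(d) * phi.reshape(d, d)

def complete_measurement(W):
    """Rescale W to norm 1 and return (W1, W2) with W1^+ W1 + W2^+ W2 = I."""
    W1 = W / np.linalg.norm(W, 2)                       # spectral norm equal to 1
    gap = np.eye(W.shape[1]) - W1.conj().T @ W1         # positive semidefinite
    vals, vecs = np.linalg.eigh(gap)
    W2 = vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T
    return W1, W2

rng = np.random.default_rng(0)
d = 2
phi = rng.normal(size=d * d) + 1j * rng.normal(size=d * d)
phi /= np.linalg.norm(phi)

A = operator_from_vector(phi, d)
print(np.allclose(np.kron(A, np.eye(d)) @ psi_plus(d), phi))

W1, W2 = complete_measurement(A.conj().T)               # the filter is W = A^+
print(np.allclose(W1.conj().T @ W1 + W2.conj().T @ W2, np.eye(d)))
```

Rescaling W changes only the probability of the outcome, not the filtered state, in line with the remark following (5.48).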
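5.3.3 Examples

Consider the state (5.36) from Sect. 5.2.4, $\varrho = p|\psi_-\rangle\langle\psi_-| + (1-p)|00\rangle\langle 00|$.
It is entangled for all p > 0. In matrix notation we have

$$\varrho = \begin{pmatrix} 1-p & 0 & 0 & 0 \\ 0 & \frac{p}{2} & -\frac{p}{2} & 0 \\ 0 & -\frac{p}{2} & \frac{p}{2} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \qquad \varrho^{T_B} = \begin{pmatrix} 1-p & 0 & 0 & -\frac{p}{2} \\ 0 & \frac{p}{2} & 0 & 0 \\ 0 & 0 & \frac{p}{2} & 0 \\ -\frac{p}{2} & 0 & 0 & 0 \end{pmatrix} .$$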



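The negative eigenvalue of $\varrho^{T_B}$ is $\lambda_- = \frac{1}{2}\left(1 - p - \sqrt{(1-p)^2 + p^2}\right)$ and
the corresponding (unnormalized) eigenvector is $\psi = \lambda_-|00\rangle - (p/2)|11\rangle$, and
hence we can take the filter to be of the form $W = \mathrm{diag}[\lambda_-, -p/2]$. The new
state $\tilde\varrho$ resulting from filtering is of the form

$$\tilde\varrho = \frac{1}{N} \begin{pmatrix} \lambda_-^2 (1-p) & 0 & 0 & 0 \\ 0 & \frac{p}{2}\lambda_-^2 & \frac{p^2}{4}\lambda_- & 0 \\ 0 & \frac{p^2}{4}\lambda_- & \frac{p^3}{8} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} , \tag{5.56}$$

where $N = \lambda_-^2 (1-p) + p^3/8 + \lambda_-^2 p/2$. Now the overlap with $\psi_-$, given by
$\langle\psi_-|\tilde\varrho|\psi_-\rangle = \frac{1}{2}(p^3/8 + \lambda_-^2 p/2 - \lambda_- p^2/2)/N$, is greater than 1/2 whenever p > 0.
The new state can be distilled by the BBPSSW protocol.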
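
The example can be reproduced numerically. The sketch below builds the state, takes the eigenvector of the partial transpose belonging to the negative eigenvalue, forms the filter $W = A_\psi^\dagger$ and evaluates the overlap with $\psi_-$. The function names are illustrative, and the filter is rescaled to unit norm so that the returned number is a valid probability.

```python
import numpy as np

PSI_MINUS = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)

def partial_transpose_B(rho, dA, dB):
    """Transpose the indices of the second subsystem."""
    return rho.reshape(dA, dB, dA, dB).transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def example_state(p):
    """rho = p |psi_-><psi_-| + (1 - p) |00><00|."""
    ket00 = np.array([1.0, 0.0, 0.0, 0.0])
    return p * np.outer(PSI_MINUS, PSI_MINUS) + (1 - p) * np.outer(ket00, ket00)

def filter_state(rho):
    """Filter a two-qubit NPT state with W = A_psi^+; returns (filtered state, probability)."""
    vals, vecs = np.linalg.eigh(partial_transpose_B(rho, 2, 2))
    if vals[0] >= 0:
        raise ValueError("partial transpose is positive: the state is not entangled")
    psi = vecs[:, 0]                                   # eigenvector of the negative eigenvalue
    W = (np.sqrt(2) * psi.reshape(2, 2)).conj().T      # psi = (A x I) psi_+, W = A^+
    W = W / np.linalg.norm(W, 2)                       # rescale so that the norm is 1
    K = np.kron(W, np.eye(2))
    out = K @ rho @ K.conj().T
    prob = float(np.real(np.trace(out)))
    return out / prob, prob

def singlet_overlap(rho):
    return float(np.real(PSI_MINUS @ rho @ PSI_MINUS))

for p in (0.1, 0.5, 0.9):
    rho_f, prob = filter_state(example_state(p))
    print(f"p = {p:.1f}  overlap before = {singlet_overlap(example_state(p)):.4f}  "
          f"after = {singlet_overlap(rho_f):.4f}  success probability = {prob:.4f}")
```

For small p the overlap before filtering equals p and lies below 1/2, while the overlap after filtering exceeds 1/2, at the price of a small success probability.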

Below we shall prove that some states of higher-dimensional systems are 
distillable. We shall do this by showing that some LOCC operation can con- 
vert them (possibly with some probability) into an entangled two-qubit state. 

Distillation of Isotropic State for $d \otimes d$ System. For F > 1/d, an
isotropic state can be distilled [281, 313]. If both Alice and Bob apply the
projector $P = |0\rangle\langle 0| + |1\rangle\langle 1|$, where $|0\rangle$, $|1\rangle$ are vectors from the local basis,
then the isotropic state will be converted into a two-qubit isotropic state.
(Note that the projectors play the role of filters; also, the filtering is successful
if both Alice and Bob obtain outcomes corresponding to P.) Now, if the initial
state satisfied F > 1/d then the final state, as a two-qubit state, will have
F > 1/2. Thus it is entangled and hence can be distilled.
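
The relation between the two thresholds can be verified directly. The following sketch assumes the standard parametrization of the $d \otimes d$ isotropic state, $\varrho_F = F P_+ + \frac{1-F}{d^2-1}(I - P_+)$, which is not written out in this section; the function names are illustrative.

```python
import numpy as np

def psi_plus(d):
    return np.eye(d).reshape(d * d) / np.sqrt(d)

def isotropic(d, F):
    """Isotropic state of a d x d system with singlet fraction F (assumed parametrization)."""
    P = np.outer(psi_plus(d), psi_plus(d))
    return F * P + (1 - F) / (d**2 - 1) * (np.eye(d * d) - P)

def project_to_two_qubits(rho, d):
    """Apply P x P with P = |0><0| + |1><1| and restrict to the two-qubit subspace."""
    idx = [0, 1, d, d + 1]                  # |00>, |01>, |10>, |11> inside C^d x C^d
    sub = rho[np.ix_(idx, idx)]
    prob = float(np.trace(sub))             # probability that both projections succeed
    return sub / prob, prob

def two_qubit_fraction(d, F):
    rho2, _ = project_to_two_qubits(isotropic(d, F), d)
    v = psi_plus(2)
    return float(v @ rho2 @ v)

d = 3
for F in (0.2, 1 / 3, 0.5, 0.9):
    print(f"d = {d}  F = {F:.3f}  two-qubit F = {two_qubit_fraction(d, F):.4f}")
```

In this sketch the two-qubit singlet fraction crosses 1/2 exactly at F = 1/d.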

Distillation and Reduction Criterion. Any state $\varrho$ of a $d \otimes d$ system
that violates the reduction criterion (see Sect. 5.2.4) can be distilled [281].
Indeed, take a vector $\psi$ for which $\langle\psi|\varrho_A \otimes I - \varrho|\psi\rangle < 0$. It is easy to see that
by applying the filter W given by $\psi = W \otimes I \, \psi_+$, one obtains a state with
F > 1/d. Now, the random $U \otimes U^*$ transformations will convert it into an
isotropic state with the same F. As shown above, the latter state is distillable.



5.3.4 Bound Entanglement 

In the light of the result for two qubits, it was naturally expected that any
entangled state could be distilled. It was a great surprise when it became
apparent that this was not the case. In [71] it was shown that there exist
entangled states that cannot be distilled. The following theorem provides a
necessary and sufficient condition for the distillability of a mixed state [71].

Theorem 5.4. A state $\varrho$ is distillable if and only if, for some two-dimensional
projectors P, Q and for some number n, the state $P \otimes Q \, \varrho^{\otimes n} \, P \otimes Q$ is
entangled.

Remarks. (1) Note that the state $P \otimes Q \, \varrho^{\otimes n} \, P \otimes Q$ is effectively a two-qubit
one as its support is contained in the $\mathbb{C}^2 \otimes \mathbb{C}^2$ subspace determined by the
projectors P, Q. This means that the distillable entanglement is a two-qubit
entanglement. (2) One can see that the theorem is compatible with the fact
[326] that any entangled pure state can be distilled.

As a consequence of this theorem, we obtain the following theorem [71]:

Theorem 5.5. A PPT state cannot be distilled.

Proof. We shall give here a proof independent of Theorem 5.4. As a matter
of fact, we shall show that (i) the set of PPT states is invariant under LOCC
operations [71] and (ii) it is bounded away from the maximally entangled
state [311, 332]. Then, since $(\varrho^{\otimes n})^{T_B} = (\varrho^{T_B})^{\otimes n}$, we obtain the proof of the
theorem. To prove (i), note that any LOCC operation can be written as [268]

$$\varrho \to \varrho' = \frac{1}{p} \sum_i A_i \otimes B_i \, \varrho \, A_i^\dagger \otimes B_i^\dagger , \tag{5.57}$$

where p is a normalization constant interpreted as the probability of
realization of the operation, and the map $\varrho \to \sum_i A_i \otimes B_i \, \varrho \, A_i^\dagger \otimes B_i^\dagger$ does not
increase the trace (this ensures $p \le 1$). Suppose now that $\varrho$ is PPT, i.e.
$\varrho^{T_B} \ge 0$, and examine partial transposition of the state $\varrho'$. We shall use the
following property of partial transposition:

$$(A \otimes B \, \varrho \, C \otimes D)^{T_B} = A \otimes D^T \, \varrho^{T_B} \, C \otimes B^T \tag{5.58}$$

for any operators A, B, C, D and $\varrho$. Then we obtain

$$(\varrho')^{T_B} = \frac{1}{p} \sum_i A_i \otimes (B_i^\dagger)^T \, \varrho^{T_B} \, A_i^\dagger \otimes B_i^T . \tag{5.59}$$

Thus $(\varrho')^{T_B}$ is a result of the action of some completely positive map on an
operator $\varrho^{T_B}$ that by assumption is positive. Then also the operator $(\varrho')^{T_B}$
must be positive. Thus a LOCC map does not move outside the set of PPT
states.
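
The property (5.58) is easy to test on random matrices. The following minimal sketch uses a $2 \otimes 3$ system and arbitrary complex operators; the helper names are illustrative.

```python
import numpy as np

def partial_transpose_B(X, dA, dB):
    """Transpose the indices of the second subsystem."""
    return X.reshape(dA, dB, dA, dB).transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

rng = np.random.default_rng(1)

def random_matrix(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

dA, dB = 2, 3
A, C = random_matrix(dA), random_matrix(dA)
B, D = random_matrix(dB), random_matrix(dB)
rho = random_matrix(dA * dB)          # (5.58) is linear, so any operator will do

lhs = partial_transpose_B(np.kron(A, B) @ rho @ np.kron(C, D), dA, dB)
rhs = np.kron(A, D.T) @ partial_transpose_B(rho, dA, dB) @ np.kron(C, B.T)
print(np.allclose(lhs, rhs))
```

Note that plain transposition, not Hermitian conjugation, appears on Bob's side, and that the operators B and D exchange their positions.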

To prove (ii), let us now show that PPT states can never have a high
singlet fraction F. Consider a PPT state $\varrho$ of a $d \otimes d$ system. We obtain

$$\mathrm{Tr}\, \varrho P_+ = \mathrm{Tr}\, \varrho^{T_B} P_+^{T_B} . \tag{5.60}$$



Mixed-State Entanglement and Quantum Communication 



179 



Now, it is easy to check that $P_+^{T_B} = (1/d)V$, where V is the flip operator
described in Sect. 5.2.4. Note that V is Hermitian and has eigenvalues ±1.
Since $\varrho$ is PPT, $\tilde\varrho = \varrho^{T_B}$ is a legitimate state, and the above expression
can be rewritten in terms of the mean value of the observable V as

$$\mathrm{Tr}\, \varrho P_+ = \frac{1}{d} \mathrm{Tr}\, \tilde\varrho V . \tag{5.61}$$

The mean value of a dichotomic observable cannot exceed 1, so that we obtain

$$F(\varrho) \le \frac{1}{d} . \tag{5.62}$$

Thus the maximal possible singlet fraction that can be attained by PPT states
is the one that can be obtained without any prior entanglement between the
parties. Indeed, a product state $|00\rangle$ has a singlet fraction 1/d (if it belongs
to the Hilbert space $\mathbb{C}^d \otimes \mathbb{C}^d$). Consequently, for however large an amount
of PPT pairs, even a single two-qubit pair with F > 1/2 cannot be obtained
by LOCC actions. □
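
Both facts used in part (ii) of the proof can be checked numerically. The sketch below verifies $P_+^{T_B} = V/d$ and samples random separable states, which are PPT, to confirm that their singlet fraction does not exceed 1/d. It is an illustration only: it does not sample PPT entangled states, and the function names are hypothetical.

```python
import numpy as np

def psi_plus(d):
    return np.eye(d).reshape(d * d) / np.sqrt(d)

def partial_transpose_B(X, dA, dB):
    return X.reshape(dA, dB, dA, dB).transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def flip(d):
    """Flip operator V |i>|j> = |j>|i>."""
    V = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            V[i * d + j, j * d + i] = 1.0
    return V

def singlet_fraction(rho, d):
    v = psi_plus(d)
    return float(np.real(v @ rho @ v))

def random_separable(d, terms, rng):
    """Random convex mixture of pure product states."""
    rho = np.zeros((d * d, d * d), dtype=complex)
    for w in rng.dirichlet(np.ones(terms)):
        a = rng.normal(size=d) + 1j * rng.normal(size=d)
        b = rng.normal(size=d) + 1j * rng.normal(size=d)
        v = np.kron(a / np.linalg.norm(a), b / np.linalg.norm(b))
        rho += w * np.outer(v, v.conj())
    return rho

d = 3
P_plus = np.outer(psi_plus(d), psi_plus(d))
print(np.allclose(partial_transpose_B(P_plus, d, d), flip(d) / d))

rng = np.random.default_rng(0)
worst = max(singlet_fraction(random_separable(d, 5, rng), d) for _ in range(200))
print(f"largest sampled F = {worst:.4f}, bound 1/d = {1 / d:.4f}")
```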

Now, one can appreciate the results presented in the first part of this con- 
tribution. From Sect. 5.2.3, we know that there exist entangled states that 
are PPT. So far, the question of whether there exist entangled states that 
are PPT has been merely a technical one. At this point, since the above the- 
orem implies that PPT states are nondistillable, we can draw a remarkable 
conclusion: there exist nondistillable entangled states. Since, in the process of 
distillation, no entanglement can be liberated to the useful singlet form, they 
have been called bound entangled (BE) states. Thus there exist at least two 
qualitatively different types of entanglement: apart from the free entanglement 
that can be distilled, there is a bound one that cannot be distilled and seems 
to be completely useless for quantum communication. This discontinuity of 
the structure of the entanglement of mixed states was considered to be pos- 
sible for multipartite systems, but it was completely surprising for bipartite 
systems. It should be emphasized here that the BE states are not atypical in 
the set of all possible states: as we have mentioned in Sect. 5.2.5, the volume 
of the PPT entangled states is nonzero. One of the main consequences of the 
existence of BE states is that it reveals a transparent form of irreversibility 
in entanglement processing. If Alice and Bob share pairs in a pure state, then 
to produce a BE state they need some prior entanglement. However, once
they have produced the BE states, they are not able to recover the pure en- 
tanglement from them. It is entirely lost. This is a qualitative irreversibility, 
which is probably a source of the quantitative irreversibility [66, 266] that is 
due to the fact that we need more pure entanglement to produce some mixed 
states than we can then distill back from them [269, 334].

Footnote: This was rigorously proved recently in [333].

Footnote: In fact, the existence of both kinds of irreversibility has not been rigorously
proved so far (see [335]). The proof of quantitative irreversibility in [336] turned
out to be invalid: it was based on a theorem [311] on the additivity of the relative
entropy of Werner states. However, an explicit counterexample to this theorem
was provided in [337].

To analyse the phenomenon of bound entanglement, one needs as many
examples of BE states as possible. Hence there is a very exciting physical
motivation for the search for PPT entangled states. In Sect. 9 we discussed
different methods of searching. As a result we obtained a couple of examples
of BE states via the separability criterion given by Theorem 5.3, from
the mathematical literature on nondecomposable maps and via unextendible
product bases.

The examples produced via UPBs are extremely interesting from the physical
point of view. This is because a UPB is not only a mathematical object:
as shown in [273], it produces a very curious physical effect [120] called
“nonlocality without entanglement”. Namely, suppose that Alice and Bob share
a pair in one of the states from the UPB, but they do not know which
state this is. It appears that by LOCC operations (with finite resources),
they are not able to read the identity of the state. However, if the particles
were together, then, since the states are orthogonal, they could be perfectly
distinguished from each other. Thus we have a highly nonclassical effect
produced by an ensemble of separable states. On the other hand, the BE state
associated with the given UPB (the uniform state on the complementary
subspace, see (5.26)) presents opposite features: it is entangled but, since its
entanglement is bound, it ceases to behave in a quantum manner. Moreover,
in both situations we have a kind of irreversibility. As was mentioned, BE
states are a reflection of the formation-distillation irreversibility: to create
them by LOCC from singlet pairs, Alice and Bob need a nonzero amount of
the latter. However, once they are created, there is no way to distill singlets
out of them. On the other hand, a UPB exhibits preparation-measurement
irreversibility: any of the states belonging to the UPB can be prepared by
LOCC operations, but once Alice and Bob forget the identity of the state,
they cannot recover it by LOCC. This surprising connection between some
BE states and bases that are not distinguishable by LOCC raises many
interesting questions concerning the future unification of our knowledge about
the nature of quantum information.

Finally, we shall mention a result concerning the rank of the BE state. In
numerical analysis of BE states (especially their tensor products), it is very
convenient to have examples with low rank. However, in [282] the following
bound on the rank of the BE state $\varrho$ was derived:

$$R(\varrho) \ge \max\{R(\varrho_A), R(\varrho_B)\} . \tag{5.63}$$

(Recall that $R(\varrho)$ denotes the rank of $\varrho$.) Note that the above inequality
is nothing but the entropic inequality (5.10) with the entropy (5.11). Thus
it appears that the latter inequality is a necessary condition not only for
separability, but also for nondistillability. The proof is based on the fact [281]
that any state violating the reduction criterion (see Sects. 5.2.4 and 5.3.3) can
be distilled. It can be shown that, if a state violates the above inequality, then
it must also violate the reduction criterion, and hence can be distilled. Then
it follows that there does not exist any BE state of rank two [282]. Indeed,
if such a state existed, then its local ranks could not exceed two. Hence the
total state would be effectively a two-qubit state. However, from Sect. 5.3.2 we
know that two-qubit bound entangled states do not exist.

5.3.5 Do There Exist Bound Entangled NPT States? 

So far we have considered BE states that arise from Theorem 5.5, which says
that the NPT condition is necessary for distillability. As mentioned in Sect.
5.3.2, for $2 \otimes n$ systems all NPT states can be distilled [331], and hence the
condition is also sufficient in this case. However, it is not known whether
it is sufficient in general. A necessary and sufficient condition is given by
Theorem 5.4. To find if this condition is equivalent to the NPT one, it must
be determined whether there exists an NPT state such that, for any number
of copies n, the state $\varrho^{\otimes n}$ will not have an entangled two-qubit “substate”
(i.e. the state $P \otimes Q \, \varrho^{\otimes n} \, P \otimes Q$). In [281] it was pointed out that one can
reduce the problem by means of the following observation.

Proposition 5.1. The following statements are equivalent: 

1. Any NPT state is distillable. 

2. Any entangled Werner state (5.30) is distillable. 

Proof. The proof of the implication (1) $\Rightarrow$ (2) is immediate, as Werner states
are entangled if and only if they are NPT. If we can distill any NPT state,
then also Werner entangled states are distillable. To obtain (2) $\Rightarrow$ (1), note
that the reasoning of Sect. 5.3.2, from (5.49) to (5.52), is insensitive to the
dimension d of the problem. Consequently, from any NPT state, a suitable
filtering produces a state $\tilde\varrho$ satisfying $\mathrm{Tr}\, \tilde\varrho V < 0$. As mentioned in Sect. 5.2.4,
the parameter $\mathrm{Tr}\, \varrho V$ is invariant under $U \otimes U$ twirling, so that by applying
the latter (which is an LOCC operation), Alice and Bob obtain a Werner
state $\varrho_W$ satisfying $\mathrm{Tr}\, \varrho_W V < 0$. Thus any NPT state can be converted by
means of LOCC operations into an entangled Werner state, which completes
the proof. □
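
The role of the parameter $\mathrm{Tr}\, \varrho V$ can be illustrated on explicit Werner states. The sketch below assumes the parametrization $\varrho_W = [(d - \beta) I + (d\beta - 1) V] / (d(d^2 - 1))$ with $\beta = \mathrm{Tr}\, \varrho_W V$; the formula (5.30) is not reproduced in this section and may use a different but equivalent parameter. The function names are illustrative.

```python
import numpy as np

def flip(d):
    """Flip operator V |i>|j> = |j>|i>."""
    V = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            V[i * d + j, j * d + i] = 1.0
    return V

def partial_transpose_B(X, dA, dB):
    return X.reshape(dA, dB, dA, dB).transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def werner(d, beta):
    """U x U invariant state with Tr(rho V) = beta, where -1 <= beta <= 1 (assumed form)."""
    return ((d - beta) * np.eye(d * d) + (d * beta - 1) * flip(d)) / (d * (d**2 - 1))

def min_pt_eigenvalue(rho, d):
    return float(np.linalg.eigvalsh(partial_transpose_B(rho, d, d))[0])

d = 3
for beta in (-0.5, -0.1, 0.0, 0.3):
    rho = werner(d, beta)
    print(f"Tr(rho V) = {np.trace(rho @ flip(d)):+.2f}   "
          f"smallest eigenvalue of the partial transpose = {min_pt_eigenvalue(rho, d):+.4f}")
```

In this sketch the partial transpose has the eigenvalue $\beta/d$, so the state is NPT exactly when $\mathrm{Tr}\, \varrho_W V < 0$.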

The above proposition implies that to determine whether there exist NPT 
bound entangled states, one can restrict oneself to the family of Werner states, 
which is a one-parameter family of very high symmetry. Even after such a 
reduction of the problem, the latter remains extremely difficult. In [310, 338] 
the authors examine the nth tensor power of Werner states (in [338] a larger, 
two-parameter family is considered). The results, though not conclusive yet, 
strongly suggest that there exist NPT bound entangled states (see Fig. 5.5). 

Thus it is likely that the characterization of distillable states is not as simple
as reduction to an NPT condition. The possible existence of NPT bound
entanglement would make the total picture much more obscure (and hence
much more interesting). Among others, there would arise the following
question: for two distinct BE states $\varrho_1$ and $\varrho_2$, is the state $\varrho_1 \otimes \varrho_2$ also BE? (If
BE were equivalent to PPT, this question would have an immediate answer
“yes”, because the PPT property is additive, i.e. if two states are PPT, then
so is their tensor product [284]). Recently, a negative answer to this question
was obtained in [339] in the case of a multipartite system. For bipartite states
the answer is still unknown.

182 Michal Horodecki, Pawel Horodecki and Ryszard Horodecki

Fig. 5.5. Entanglement and distillability of mixed states for $2 \otimes 2$ and $2 \otimes 3$ systems
(a) and for higher dimensions (b). The area filled with diagonal lines denotes the
hypothetical set of bound entangled NPT states

5.3.6 Example 

Consider the family of states (5.37) introduced in Sect. 5.2.4. One obtains the
following classification: $\varrho$ is

• separable for $2 \le \alpha \le 3$

• bound entangled (BE) for $3 < \alpha \le 4$

• free entangled (FE) for $4 < \alpha \le 5$.

The separability was shown in Sect. 5.2.4. It was also shown there that, for
$3 < \alpha \le 4$, the state is entangled and PPT. In this case we conclude that it is
BE. For $\alpha > 4$, Alice and Bob can apply local projectors $P = |0\rangle\langle 0| + |1\rangle\langle 1|$,
obtaining an entangled two-qubit state. Hence the initial state is FE in this
region of $\alpha$.
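
The PPT and NPT parts of this classification can be checked numerically. The sketch below assumes that (5.37), which is not reproduced in this section, is the $3 \otimes 3$ family $\varrho_\alpha = \frac{2}{7} P_+ + \frac{\alpha}{7} \sigma_+ + \frac{5-\alpha}{7} \sigma_-$ with $\sigma_+ = \frac{1}{3}(|01\rangle\langle 01| + |12\rangle\langle 12| + |20\rangle\langle 20|)$ and $\sigma_- = \frac{1}{3}(|10\rangle\langle 10| + |21\rangle\langle 21| + |02\rangle\langle 02|)$. The code does not test separability, nor the entanglement of the PPT states for $3 < \alpha \le 4$; the function names are illustrative.

```python
import numpy as np

def partial_transpose_B(X, dA, dB):
    return X.reshape(dA, dB, dA, dB).transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def state_alpha(alpha):
    """Assumed form of the family (5.37) on C^3 x C^3."""
    d = 3
    psi = np.eye(d).reshape(d * d) / np.sqrt(d)
    rho = (2 / 7) * np.outer(psi, psi)
    for i in range(d):
        j = (i + 1) % d
        rho[i * d + j, i * d + j] += alpha / 21          # sigma_+ : |01>, |12>, |20>
        rho[j * d + i, j * d + i] += (5 - alpha) / 21    # sigma_- : |10>, |21>, |02>
    return rho

def min_pt_eigenvalue(rho, d):
    return float(np.linalg.eigvalsh(partial_transpose_B(rho, d, d))[0])

def projected_two_qubit(rho):
    """Apply P x P with P = |0><0| + |1><1| and normalize."""
    idx = [0, 1, 3, 4]                                   # |00>, |01>, |10>, |11>
    sub = rho[np.ix_(idx, idx)]
    return sub / np.trace(sub)

for alpha in (2.5, 3.5, 4.0, 4.5, 5.0):
    rho = state_alpha(alpha)
    print(f"alpha = {alpha:.1f}   min eig of PT (3x3) = {min_pt_eigenvalue(rho, 3):+.5f}   "
          f"min eig of PT (projected 2x2) = {min_pt_eigenvalue(projected_two_qubit(rho), 2):+.5f}")
```

Under this assumption the partial transpose is positive up to $\alpha = 4$ and acquires a negative eigenvalue for $\alpha > 4$, both for the full state and for the projected two-qubit state.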

5.3.7 Some Consequences of the Existence of Bound Entanglement

A basic question that arises in the context of bound entanglement is: what 
is its role in quantum information theory? We shall show in the following 
sections that even though it is indeed a very poor type of entanglement, it 
can produce a nonclassical effect, enhancing quantum communication via a 
subtle activation-like process [340]. This will lead us to a new paradigm of 
entanglement processing that extends the “LOCC paradigm” . Moreover, the 
existence of bound entanglement means that there exist stronger limits on 
the distillation rate than were expected before. We shall report these and 
other consequences in the next few subsections. 

Bound Entanglement and Teleportation. By definition, BE states can- 
not be distilled, and hence it is impossible to obtain faithful teleportation via 
such states. However, it might be the case that the transmission fidelity of im- 
perfect teleportation might still be better than that achievable with a purely 
classical channel, i.e. without sharing any entanglement (this is a way of re- 
vealing a manifestation of quantum features of some mixed states [264]).^^ 
Initial searches produced a negative result [314]. Here we present more gen- 
eral results, according to which the most general teleportation scheme cannot 
produce better than classical fidelity if Alice and Bob share BE states. 

Footnote: For a detailed study of the standard teleportation scheme via a mixed
two-qubit state, see [341]. The optimal one-way teleportation scheme via pure
states was obtained in [342].

General Teleportation Scheme. Teleportation, as originally devised [261],
is a way of transmitting a quantum state by use of a classical channel and a
bipartite entangled state (pure singlet state) shared by Alice and Bob. The most
general scheme of teleportation would then be of the following form [343].
There are three systems: that of the input particle, the state of which is to
be teleported (we ascribe to this system the Hilbert space $\mathcal{H}_{A'}$),
and two systems that are in the entangled state $\varrho_{AB}$ (with Hilbert
space $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$). For simplicity
we assume that $\dim \mathcal{H}_{A'} = \dim \mathcal{H}_A = \dim \mathcal{H}_B = d$.
The initial state is

$$|\psi_{A'}\rangle\langle\psi_{A'}| \otimes \varrho_{AB} ,$$

where $\psi_{A'}$ is the state to be teleported (unknown to Alice and Bob). Now
Alice and Bob perform some trace-preserving LOCC operation (trace-preserving,
because teleportation is an operation that must be performed with probability
1). The form of the operation depends on the state $\varrho_{AB}$ that is known
to Alice and Bob, but is independent of the input state $\psi_{A'}$ because that
state is unknown. Now the total system is in a new, perhaps very complicated
state $\varrho_{A'AB}$. The transmitted state is given by
$\mathrm{Tr}_{A'A}(\varrho_{A'AB})$. The overall transmission stages are the
following:

$$\psi_{A'} \to |\psi_{A'}\rangle\langle\psi_{A'}| \otimes \varrho_{AB} \to \Lambda(|\psi_{A'}\rangle\langle\psi_{A'}| \otimes \varrho_{AB}) \to \mathrm{Tr}_{A'A}\,\varrho_{A'AB} = \varrho_B .$$

The transmission fidelity is now defined by 

$$f = \overline{\langle\psi_{A'}|\varrho_B|\psi_{A'}\rangle} ,$$

where the average is taken over a uniform distribution of the input states
$\psi_{A'}$. In the original teleportation scheme (where $\varrho_{AB}$ is a
maximally entangled state), the state $\varrho_B$ is exactly equal to the input
state, so that $f = 1$. If Alice and Bob share a pair in a separable state (or,
equivalently, share no pair), then the best one can do is the following: Alice
measures the state and sends the results to Bob [264]. Since it is impossible
to find the form of the state when one has only a single system in that state
[345] (it would contradict the no-cloning theorem [346] (see Sect. 1)), the
performance of such a process will be very poor. One can check that the best
possible fidelity is $f = 2/(d + 1)$. If the shared pair is entangled but is
not a pure maximally entangled state, we shall obtain some intermediate value
of $f$.

Optimal Teleportation. Having defined the general teleportation scheme,
one can ask about the maximal fidelity that can be achieved for a given state
$\varrho_{AB}$ within the scheme. Thus, for a given $\varrho_{AB}$ we must
maximize $f$ over all possible trace-preserving LOCC operations. The problem
is, in general, extremely difficult. However, the high symmetry of the chosen
fidelity function allows one to reduce it in the following way. It has been
shown [343] that the best Alice and Bob can do is the following. They first
perform some LOCC action that aims at increasing $F(\varrho_{AB})$ as much as
possible. Then they perform the standard teleportation scheme, via the new
state $\varrho'_{AB}$ (just as if it were the state $P_+$). The fidelity
obtained is given by

$$f_{\max} = \frac{F_{\max}\, d + 1}{d + 1} , \qquad (5.64)$$

where $F_{\max} = F(\varrho'_{AB})$ is the maximal $F$ that can be obtained by
trace-preserving LOCC actions if the initial state is $\varrho_{AB}$.

Footnote: Note that the fidelity so defined is not a unique criterion of the
performance of teleportation. For example, one can consider a restricted input:
Alice receives one of two nonorthogonal vectors with some probabilities [344].
Then the formula for the fidelity would be different. In general, the fidelity
is determined by a chosen distribution over input states.

Teleportation Via Bound Entangled States. According to (5.64), to
check the performance of teleportation via BE states of a $d \otimes d$ system,
we should find the maximal $F$ attainable from BE states via trace-preserving
LOCC actions. As was argued in Sect. 5.3.4, a BE state subjected to any
LOCC operation remains BE. Moreover, the singlet fraction $F$ of a BE state
of a $d \otimes d$ system satisfies $F \le 1/d$ (because states with $F > 1/d$
are distillable, as shown in Sect. 5.3.3). We conclude that, if the initial
state is BE, then the highest $F$ achievable by any (not only trace-preserving)
LOCC actions is $F = 1/d$. However, as we have argued, this gives a fidelity
$f = 2/(d + 1)$, which can be achieved without entanglement. Thus the BE states
behave here like separable states - their entanglement does not manifest itself.
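
The arithmetic behind this conclusion can be checked directly. The sketch below
assumes the relation $f = (Fd + 1)/(d + 1)$ between the singlet fraction and the
teleportation fidelity, as in (5.64); the function names are hypothetical and
exact rational arithmetic is used only for convenience.

```python
from fractions import Fraction


def teleportation_fidelity(F, d: int) -> Fraction:
    """Fidelity f = (F*d + 1)/(d + 1) of standard teleportation for singlet fraction F."""
    return (Fraction(F) * d + 1) / (d + 1)


def classical_fidelity(d: int) -> Fraction:
    """Best fidelity 2/(d + 1) achievable without shared entanglement."""
    return Fraction(2, d + 1)


if __name__ == "__main__":
    for d in (2, 3, 4):
        f_be = teleportation_fidelity(Fraction(1, d), d)  # largest F of a BE state
        f_max = teleportation_fidelity(1, d)              # maximally entangled pair
        print(d, f_be, classical_fidelity(d), f_max)
```

For every dimension the first two printed values coincide, which is the
statement that $F = 1/d$ gives no advantage over the classical protocol.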

Activation of Bound Entanglement. Here we shall show that bound 
entanglement can produce a nonclassical effect, even though the effect is a 
very subtle one. This effect is the so-called activation of bound entanglement 
[340]. The underlying concept originates from a formal entanglement-energy 
analogy developed in [71, 269, 325, 335, 347]. One can imagine that the bound 
entanglement is like the energy of a system confined in a shallow potential 
well. Then, as in the process of chemical activation, if we add a small amount 
of extra energy to the system, its energy can be liberated. 

In our case, the role of the system is played by a large number of bound
entangled pairs, while that of the extra energy is played by a single pair
that is free entangled. More specifically, we shall show that a process called 
conclusive teleportation [348] can be performed with arbitrarily high fidelity 
if Alice and Bob can perform joint operations over the BE pairs and the 
EE pair. We shall argue that it is impossible if either of the two elements is 
lacking. 

Conclusive Teleportation. Suppose that Alice and Bob have a pair in a
state for which the optimal teleportation fidelity is $f_0$. Suppose, further,
that the fidelity is too poor for some of Alice and Bob’s purposes. What they
can do to change the situation is to perform a so-called conclusive
teleportation. Namely, they can perform some LOCC operation with two final
outcomes 0 and 1. If they obtain the outcome 0, they fail and decide to discard
the pair. If the outcome is 1 they perform teleportation, and the fidelity is
now better than the initial $f_0$. Of course, the price they must pay is that
the probability of success (outcome 1) may be small. The scheme is illustrated
in Fig. 5.6.

Fig. 5.6. Conclusive teleportation. Starting with a weakly entangled pair, Alice
and Bob prepare with probability p a strongly entangled pair and then perform
teleportation



A simple example is the following. Suppose that Alice and Bob share a
pair in a pure state $\psi = a|00\rangle + b|11\rangle$ which is nearly a
product state (e.g. $a$ is close to 1). Then the standard teleportation scheme
provides a rather poor fidelity $f = 2(1 + ab)/3$ [341, 349]. However, Alice
can subject her particle to a filtering procedure [326, 330] described by the
operation

$$\Lambda = W(\cdot)W^\dagger + V(\cdot)V^\dagger , \qquad (5.65)$$

where $W = \mathrm{diag}(b, a)$, $V = \mathrm{diag}(a, b)$. Here the outcome 1
(success) corresponds to the operator $W$. Indeed, if this outcome is obtained,
the state collapses to the singlet state

$$\frac{W \otimes I\,\psi}{\|W \otimes I\,\psi\|} = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle) . \qquad (5.66)$$

Then, in this case, perfect teleportation can be performed. Thus, if Alice
and Bob teleported directly via the initial state, they would obtain a very
poor performance. Now they have a small but nonzero chance of performing
perfect teleportation.
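
The filtering step can be traced numerically. The following sketch assumes real
amplitudes with $a^2 + b^2 = 1$ and a filter $W = \mathrm{diag}(b, a)$ acting on
Alice's particle only; the function name `filter_outcome` is hypothetical. It
tracks just the two nonzero amplitudes, on $|00\rangle$ and $|11\rangle$.

```python
import math


def filter_outcome(a: float):
    """Apply W = diag(b, a) to Alice's half of a|00> + b|11> (real a, a^2 + b^2 = 1).

    Returns (success probability, normalized post-filter amplitudes on |00>, |11>,
    fidelity of direct teleportation through the unfiltered state).
    """
    b = math.sqrt(1.0 - a * a)
    # W multiplies the |0> component of Alice's particle by b and the |1> component by a.
    unnormalized = (b * a, a * b)
    p_success = unnormalized[0] ** 2 + unnormalized[1] ** 2  # equals 2 a^2 b^2
    norm = math.sqrt(p_success)
    post = (unnormalized[0] / norm, unnormalized[1] / norm)
    f_direct = 2.0 * (1.0 + a * b) / 3.0
    return p_success, post, f_direct


if __name__ == "__main__":
    p, post, f_direct = filter_outcome(0.99)
    print("success probability:", p)
    print("post-filter amplitudes:", post)
    print("direct teleportation fidelity:", f_direct)
```

The closer $a$ is to 1, the smaller the success probability $2a^2b^2$, while
the post-filter amplitudes are always equal.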



Fig. 5.7. Conclusive increase of the singlet fraction. Alice and Bob obtain, with
a probability p of success, a state with a higher singlet fraction than that of the
initial state



Similarly to the usual form of teleportation, conclusive teleportation can
be reduced to conclusively increasing $F$ (illustrated in Fig. 5.7), followed by
the original teleportation protocol. If in the first stage Alice and Bob obtain
a state with some $F$, then the second stage will produce the corresponding
fidelity $f = (Fd + 1)/(d + 1)$. Thus we can restrict our consideration to
conclusively increasing the singlet fraction. The latter process was developed
in [343, 350]. An interesting peculiarity of conclusively increasing the singlet
fraction is that sometimes it is impossible to obtain $F = 1$, but it is still
possible to make $F$ arbitrarily close to 1. However, if $F \to 1$, then the
probability of success tends to 0, so that, indeed, it is impossible to reach
$F = 1$ [343].

Activation Protocol. Suppose that Alice and Bob share a single pair of
spin-1 particles in the following free entangled mixed state:

$$\varrho_{\mathrm{free}} = \varrho(F) = F|\psi_+\rangle\langle\psi_+| + (1 - F)\sigma_+ , \quad 0 < F < 1 , \qquad (5.67)$$

where $\sigma_\pm$ are separable states given by (5.38). It is easy to see that
the state (5.67) is free entangled. Namely, after action of the local
projections
$(|0\rangle\langle 0| + |1\rangle\langle 1|) \otimes (|0\rangle\langle 0| + |1\rangle\langle 1|)$,
we obtain an entangled $2 \otimes 2$ state (its entanglement can be revealed by
calculating partial transposition). Thus, according to Theorem 5.4, the state
(5.67) is FE. By complicated considerations one can show [343] that there is a
threshold $F_0 < 1$ that cannot be exceeded in the process of conclusively
increasing the singlet fraction. In other words, Alice and Bob have no chance
of obtaining a state $\varrho'$ with $F(\varrho') > F_0$ (we do not know the
value $F_0$, we only know that such a number exists).

Suppose now that Alice and Bob share, in addition, a very large number
of pairs in the following BE state (the one considered in Sect. 5.3.6):

$$\sigma_\alpha = \frac{2}{7}|\psi_+\rangle\langle\psi_+| + \frac{\alpha}{7}\sigma_+ + \frac{5 - \alpha}{7}\sigma_- . \qquad (5.68)$$

As stated in Sect. 5.3.6, for $3 < \alpha \le 4$ the state is BE. As we know,
there is no chance of obtaining even a pair with $F > 1/3$ from BE pairs of a
$3 \otimes 3$ system. Now, it turns out that, if Alice and Bob have both an FE
pair and the BE pairs, they can apply a simple protocol to obtain an $F$
arbitrarily close to 1. Thus, owing to the connection between conclusive
increasing of the singlet fraction and conclusive teleportation, the fidelity
of the latter can be arbitrarily close to unity only if both an FE pair and BE
pairs are shared.

The protocol [340] is similar to the recurrence distillation protocol de- 
scribed in Sect. 19. It is an iteration of the following two steps: 

(i) Alice and Bob take the free entangled pair, in the state
$\varrho_{\mathrm{free}}(F)$, and one of the pairs, which is in the state
$\sigma_\alpha$. They perform the bilateral XOR operation
$U_{\mathrm{BXOR}} = U_{\mathrm{XOR}} \otimes U_{\mathrm{XOR}}$, each of them
treating the member of the free entangled pair as a source and the member
of the bound entangled pair as a target.

(ii) Alice and Bob measure the members of the source pair in the basis 
|0), |1), |2). Then they compare their results via classical communication. 
If the compared results differ from one another, they have to discard both 
pairs, and then the trial of the improvement of F fails. If the results agree, 
then the trial succeeds and they discard only the target pair, coming back 
with (as we shall see) an improved source pair to the first step (i). 
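
The XOR operation of step (i) acts on basis states as
$U_{\mathrm{XOR}}|a\rangle|b\rangle = |a\rangle|(b + a) \bmod d\rangle$, see
(5.69). A minimal sketch of its action on basis labels, with $d = 3$ for the
qutrit case, is given below; the function names `xor_gate` and `bilateral_xor`
are hypothetical, and only computational-basis labels (not superpositions) are
tracked.

```python
from itertools import product


def xor_gate(a: int, b: int, d: int):
    """Map basis labels (source a, target b) to (a, (b + a) mod d)."""
    return a, (b + a) % d


def bilateral_xor(source, target, d: int):
    """Apply the XOR gate on Alice's side and on Bob's side separately.

    source = (a_Alice, a_Bob), target = (b_Alice, b_Bob).
    """
    (a_alice, a_bob), (b_alice, b_bob) = source, target
    _, t_alice = xor_gate(a_alice, b_alice, d)
    _, t_bob = xor_gate(a_bob, b_bob, d)
    return (a_alice, a_bob), (t_alice, t_bob)


d = 3
basis = list(product(range(d), repeat=2))
images = [xor_gate(a, b, d) for a, b in basis]
# The gate permutes the d*d basis states, as a unitary acting on basis labels must.
is_permutation = sorted(images) == basis

if __name__ == "__main__":
    print("permutation of basis states:", is_permutation)
    print(bilateral_xor((0, 1), (2, 0), d))
```

On basis labels, the bilateral operation leaves the source labels unchanged and
shifts the difference between Bob's and Alice's target labels by the
corresponding difference of the source labels (modulo $d$).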

After some algebra, one can see that the success in the step (ii) occurs with
a nonzero probability

$$P_{F \to F'} = \frac{2F + (1 - F)(5 - \alpha)}{7} , \qquad (5.70)$$

leading to the transformation $\varrho(F) \to \varrho(F')$, where the improved
fidelity is

$$F'(F) = \frac{2F}{2F + (1 - F)(5 - \alpha)} . \qquad (5.71)$$

Footnote: Here we need the quantum XOR gate not for two qubits, as in Sect. 19,
but for two qutrits (three-level systems). A general XOR operation for a
$d \otimes d$ system, which was used in [281, 351], is defined as

$$U_{\mathrm{XOR}}|a\rangle|b\rangle = |a\rangle|(b + a) \bmod d\rangle , \qquad (5.69)$$

where the initial states $|a\rangle$ and $|b\rangle$ correspond to the source
and target states, respectively.

Fig. 5.8. Liberation of bound entanglement. The singlet fraction of the FE state is
plotted versus the number of successful iterations of (i) and (ii), and the parameter
$\alpha$ of the state $\sigma_\alpha$ of the BE pairs used. The initial singlet
fraction of the FE pair is taken as $F_{\mathrm{in}} = 0.3$ (This figure is
reproduced from Phys. Rev. Lett. 82, 1056 (1999) by permission of the authors)



If $\alpha > 3$, then the above continuous function of $F$ exceeds the value of
$F$ on the whole region (0, 1). Thus the successful repetition of the steps (i)
and (ii) produces a sequence of source fidelities $F_n \to 1$. In Fig. 5.8 we
have plotted the value of $F$ obtained versus the number of iterations of the
protocol and the parameter $\alpha$. For $\alpha < 3$ the singlet fraction goes
down: separable states cannot help to increase it. We can see the dramatic
qualitative change at the “critical” point that occurs at the borderline
between separable states and bound entangled ones ($\alpha = 3$). On the other
hand, it is surprising that there is no qualitative difference between the
behavior of BE states ($3 < \alpha \le 4$) and FE states ($4 < \alpha \le 5$).
Here the change is only quantitative, while the shape of the corresponding
curves is basically the same. To our knowledge, this is the only effect in
which bound entanglement manifests its quantumness. Since the effect is very
subtle, one must conclude that bound entanglement is essentially different from
free entanglement and is enormously weak. For recent results on the activation
effect in the multiparticle case see [353].

Footnote: The term “critical” that we have used here reflects the rapid
character of the change (see [307] for a similar “phase transition” between
separable and FE states). On the other hand, the present development of
thermodynamic analogies in entanglement processing [71, 268, 325, 335, 347]
allows us to hope that in future one will be able to build a synthetic theory
of entanglement based on thermodynamic analogies: then the “critical” point
would become truly critical.
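
The behavior of the iteration can be reproduced with a few lines of code. The
sketch below assumes the update rule (5.71) in the form
$F' = 2F/[2F + (1 - F)(5 - \alpha)]$ and counts only successful rounds; the
function names are hypothetical. A short rearrangement of (5.71) shows that each
successful round multiplies the ratio $F/(1 - F)$ by $2/(5 - \alpha)$, which is
greater than 1 exactly when $\alpha > 3$.

```python
def improved_fraction(F: float, alpha: float) -> float:
    """Singlet fraction after one successful round, following the form of (5.71)."""
    return 2.0 * F / (2.0 * F + (1.0 - F) * (5.0 - alpha))


def iterate_protocol(F_initial: float, alpha: float, rounds: int):
    """Return the list of singlet fractions after 0, 1, ..., rounds successful rounds."""
    fractions = [F_initial]
    for _ in range(rounds):
        fractions.append(improved_fraction(fractions[-1], alpha))
    return fractions


if __name__ == "__main__":
    # F_initial = 0.3 is the value quoted in the caption of Fig. 5.8.
    for alpha in (2.5, 3.0, 3.5, 4.5):
        final = iterate_protocol(0.3, alpha, 30)[-1]
        print(f"alpha = {alpha}: F after 30 successful rounds = {final:.6f}")
```

The sketch does not model the success probability of each round, so it says
nothing about how many trials are needed on average.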

Entanglement-Enhanced LOCC Operations. The activation effect suggests
that we should extend the paradigm of LOCC operations by including
quantum communication (under suitable control). Then we obtain
entanglement-enhanced LOCC (LOCC + EE) operations (see [354]). For example, if
we allow LOCC operations and an arbitrary amount of shared bound entanglement,
we obtain the LOCC + BE paradigm. One can now ask about the entanglement of
formation and distillation in this regime. Since BE states contain
entanglement, even though it is very weak, an infinite amount of bound
entanglement could make $D_{\mathrm{LOCC+BE}}$ much larger than the usual
$D_{\mathrm{LOCC}}$: one might expect $D_{\mathrm{LOCC+BE}}$ to be the maximal
possible, independently of the input state $\varrho$ [355] (e.g. for two-qubit
pairs, we would have $D_{\mathrm{LOCC+BE}} = 1$ for any state). In [305, 356]
it was shown that this is impossible. The argument of [305] is as follows.
First, the authors recall that $D_{\mathrm{LOCC}} \le E^{F}_{\mathrm{LOCC}}$
[66]. Otherwise, it would be possible to increase entanglement by means of LOCC
actions. Indeed, suppose that for some state $\varrho$ we have
$D_{\mathrm{LOCC}}(\varrho) > E^{F}_{\mathrm{LOCC}}(\varrho)$. Then Alice and
Bob could take $n$ two-qubit pairs in a singlet state and produce
$n/E^{F}_{\mathrm{LOCC}}$ pairs of the state $\varrho$. Then they could distill
$n(D_{\mathrm{LOCC}}/E^{F}_{\mathrm{LOCC}})$ singlets, which would be a greater
number than $n$. A similar argument is applied to LOCC + BE actions: the
authors show that it is impossible to increase the number of singlet pairs by
LOCC + BE actions, and conclude that
$D_{\mathrm{LOCC+BE}} \le E^{F}_{\mathrm{LOCC+BE}}$. On the other hand,
obviously we have $E^{F}_{\mathrm{LOCC+BE}} \le E^{F}_{\mathrm{LOCC}}$.
Combining the inequalities, we obtain the result that $D_{\mathrm{LOCC+BE}}$ is
bounded by the usual entanglement of formation $E^{F}_{\mathrm{LOCC}}$, which
is maximal only for singlet-type states. A different argument in [356] is based
on the results of Rains [311] on a bound for the distillation of entanglement
(see Sect. 28). Thus, even if an infinite amount of BE pairs is employed,
LOCC + BE operations are not enormously powerful. However, it is still possible
that they are better than LOCC operations themselves, i.e. we can conjecture
that $D_{\mathrm{LOCC+BE}}(\varrho) > D_{\mathrm{LOCC}}(\varrho)$ for some
states $\varrho$.

Footnote: In the multipartite case, two other effects have recently been found
[339, 352].

Footnote: The entanglement of formation $E^{F}_{\mathrm{LOCC}}(\varrho)$ of a
state $\varrho$ is the number of input singlet pairs per output pair needed to
produce the state $\varrho$ by LOCC operations.

Bounds for Entanglement of Distillation. Bound entanglement is an
achievement in qualitative description; however, as we could see in the
previous section, it also has an impact on the quantitative approach. Here we
shall see that it has helped to obtain a strong upper bound for the
entanglement of distillation $D$ (recall that the latter has the meaning of the
capacity of a noisy teleportation channel constituted by bipartite mixed
states, and hence is a central parameter of quantum communication theory).

The first upper bound for $D$ was the entanglement of formation [66],
calculated explicitly for two-qubit states [301]. However, a stronger bound
has been provided in [269] (see also [334]). It is given by the following measure
of entanglement [268, 269] based on the relative entropy:

$$E_{VP}(\varrho) = \inf_{\sigma} S(\varrho|\sigma) , \qquad (5.72)$$

where the infimum is taken over all separable states $\sigma$. The relative
entropy is defined by

$$S(\varrho|\sigma) = \mathrm{Tr}\,\varrho\log\varrho - \mathrm{Tr}\,\varrho\log\sigma .$$

Vedral and Plenio provided a complicated argument [269] showing that $E_{VP}$
is an upper bound for $D(\varrho)$, under the additional assumption that it is
additive. Even though we still do not know if it is indeed additive, Rains
showed [311] that it is a bound for $D$ even without this assumption. He also
obtained a stronger bound by use of BE states (more precisely, PPT states). It
turns out that, if the infimum in (5.72) is taken over PPT states (which, when
entangled, are bound entangled), the new measure $E_R$ is a bound for the
distillable entanglement, too. However, since the set of PPT states is strictly
greater than the set of separable states, the bound is stronger. For example,
the entangled PPT states have zero distillable entanglement. Since they are not
separable, $E_{VP}$ does not vanish for them, and hence the evaluation of $D$
by means of $E_{VP}$ is too rough. The Rains measure vanishes for these states.

We will not provide here the original proof of the Rains result. Instead 
we demonstrate a general theorem on bounds for distillable entanglement 
obtained in [357], which allows a major simplification of the proof of the 
result. 

Theorem 5.6. Any function $B$ satisfying the conditions (a)-(c) below is an
upper bound for the entanglement of distillation:

(a) Weak monotonicity: $B(\varrho) \ge B[\Lambda(\varrho)]$, where $\Lambda$ is
a superoperator realizable by means of LOCC operations.

(b) Partial subadditivity: $B(\varrho^{\otimes n}) \le nB(\varrho)$.

(c) Continuity for an isotropic state $\varrho(F, d)$: suppose that we have a
sequence of isotropic states $\varrho(F_d, d)$ (see Sect. 5.2.4, (5.34)) such
that $F_d \to 1$ if $d \to \infty$. Then we require

$$\lim_{d \to \infty} \frac{1}{\log d} B[\varrho(F_d, d)] = 1 . \qquad (5.73)$$

Remark. If, instead of LOCC operations, we take another class $C$ of
operations including classical communication in at least one direction (e.g.
the LOCC + BE operations mentioned previously), the proof, mutatis mutandis,
also applies. (The condition (a) then involves the class $C$.)

Proof. The main idea of the proof is to exploit the monotonicity condition.
We shall show that if $D$ were greater than $B$ then, during the distillation
protocol, the function $B$ would have to increase. But this cannot be so,
because distillation is a LOCC action, and hence $B$ would violate the
assumption (a). By subadditivity, we have

$$B(\varrho) \ge \frac{1}{n} B(\varrho^{\otimes n}) . \qquad (5.74)$$

Distillation of $n$ pairs aims at obtaining $k$ pairs, each in nearly a singlet
state. The asymptotic rate is $\lim k/n$. It was shown [311] that one can
equally well think of the final $d \otimes d$ system as being in a state close
to $P_+$. The asymptotic rate is now $\lim (\log d)/n$. Then the only relevant
parameters of the final state $\varrho_{\mathrm{out}}$ are the dimension $d$
and the fidelity $F(\varrho_{\mathrm{out}})$. Thus the distillation protocol
can be followed by $U \otimes U^*$ twirling, producing an isotropic final state
$\varrho(F, d)$ (see Sect. 5.2.4). By condition (a), distillation does not
increase $B$, and hence

$$\frac{1}{n} B(\varrho^{\otimes n}) \ge \frac{1}{n} B[\varrho(F_{d_n}, d_n)] . \qquad (5.75)$$

Now, in the distillation process $F \to 1$, and if we consider an optimal
protocol, then $(\log d)/n \to D$. Hence, by condition (c), the right-hand side
of the inequality tends to $D(\varrho)$. Thus we obtain the result that
$B(\varrho) \ge D(\varrho)$. □

We should check whether the Vedral-Plenio and Rains measures satisfy
the assumptions of the theorem. Subadditivity and weak monotonicity are
immediate consequences of the properties of the relative entropy used in the
definition of $E_{VP}$ (subadditivity was proved in [268], and weak
monotonicity in [269]). The calculation of $E_{VP}$ for an isotropic state is a
little more involved, but by using the high symmetry of the state, it was found
to be [311]

$$E_{VP}[\varrho(F, d)] = E_R[\varrho(F, d)] = \log d + F\log F + (1 - F)\log\frac{1 - F}{d - 1} .$$

Evaluating now this expression for large $d$, we easily find that the condition
(c) is satisfied. The argument applies without any change to the Rains bound.
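
Condition (c) can be checked numerically for this expression. The sketch below
is an illustration under stated assumptions: the function name is hypothetical,
logarithms are taken to base 2 (the ratio in (5.73) does not depend on the
base), the formula is applied only for $1/d \le F \le 1$, and the sequence
$F_d = 1 - 1/d$ is just one convenient choice of fidelities tending to 1.

```python
import math


def relative_entropy_isotropic(F: float, d: int) -> float:
    """Evaluate log d + F log F + (1 - F) log[(1 - F)/(d - 1)] with base-2 logarithms.

    Assumption of this sketch: the expression is used for 1/d <= F <= 1;
    below F = 1/d the isotropic state is separable and the value is set to zero.
    """
    if F < 1.0 / d:
        return 0.0
    value = math.log2(d)
    if F > 0.0:
        value += F * math.log2(F)
    if F < 1.0:
        value += (1.0 - F) * math.log2((1.0 - F) / (d - 1))
    return value


if __name__ == "__main__":
    for k in (2, 5, 10, 20):
        d = 2 ** k
        F_d = 1.0 - 1.0 / d  # an arbitrary sequence with F_d -> 1
        ratio = relative_entropy_isotropic(F_d, d) / math.log2(d)
        print(f"d = 2^{k}: B/log d = {ratio:.6f}")
```

The printed ratios approach 1 as $d$ grows, in line with (5.73).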

Finally, let us note that the Rains entanglement measure attributes no
entanglement to some entangled states (the PPT entangled ones). A natural
postulate for an entanglement measure would be that the measure should vanish
if and only if the state is separable. However, then we would have to remove
distillable entanglement from the set of measures. Indeed, the distillable
entanglement vanishes for some manifestly entangled states - bound entangled
ones. Now the problem is: should we keep the postulate, or keep $D$ as a good
measure?

It is reasonable to keep $D$ as a good measure, as it has a direct physical
sense: it describes entanglement as a resource for quantum communication.
If it is not a measure, then we must conclude that we are not interested in
measures. Consequently, we adopt as a main “postulate” for an entanglement
measure the following statement: “Distillable entanglement is a good
measure”. So we must abandon the former postulate. The apparent paradox can be
removed by realizing that we have different types of entanglement. Then a
given state, even though it is entangled, may not contain some particular type
of entanglement, and the measure that quantifies that type will attribute no
entanglement to the state.



5.4 Concluding Remarks 

In contrast to the case of pure states, the problem of mixed-state entangle- 
ment is “nondegenerate” in the sense that the various scalar and structural 
separability criteria are not equivalent. There is a fundamental connection 
between entanglement and positive maps, represented by Theorem 5.1. How- 
ever, there is still a problem of turning it into an operational criterion for 
higher-dimensional systems. Recently [358, 359] the question was reduced to 
a problem of investigation of the so-called “edge” PPT entangled states, as 
well as of the positive maps and entanglement witnesses detecting their en- 
tanglement. Some operational criteria for low-rank density matrices (and also 
for the multiparticle case) have been worked out in [360]. 

It is remarkable that the structure of entanglement reveals a discontinuity. 
There are two qualitatively different types of entanglement: distillable, “free” 
entanglement, and “bound” entanglement, which cannot be distilled. All the 
two-qubit entangled states are free entangled. Moreover, a free entangled state 
in any dimension must have some features of two-qubit entanglement. Bound 
entanglement is practically useless for quantum communication. However, it 
is not a marginal phenomenon, as the volume of the set of BE states in the 
set of all states for finite dimension is nonzero. 

The discovery of activation of bipartite bound entanglement suggested
[340] the nonadditivity of the corresponding quantum communication channels,
in the sense that the distillable entanglement
$D(\varrho_{BE} \otimes \varrho_{FE})$ could exceed $D(\varrho_{FE})$ for some
free entangled state $\varrho_{FE}$ and bound entangled state $\varrho_{BE}$.
Quite recently it has been shown [339] that, in the multipartite case, two
different bound entangled states, if tensored together, can make a distillable
state: $D(\varrho^{1}_{BE} \otimes \varrho^{2}_{BE}) > D(\varrho^{1}_{BE}) + D(\varrho^{2}_{BE}) = 0$.
This new nonclassical effect was called superactivation. On the other hand, in
[352] it was shown that the four-party “unlockable” bound entangled states
[361] can be used for remote concentration of quantum information. It is
intriguing that for bipartite systems, with the exception of the activation
effect, bound entanglement is permanently passive. In general, there may be a
qualitative difference between bipartite bound entanglement and the
multipartite form. Still, in the light of recent results [362], it is quite
possible that bipartite bound entanglement is also nonadditive. The very recent
investigations of bound entanglement for continuous variables [363, 364] also
raise analogous questions in this latter domain.

Footnote: This could be then reformulated in terms of the so-called binding
entanglement channels [305, 355].

As we have seen, there is a basic connection between bound entanglement 
and irreversibility. As a consequence, it would be interesting to investigate 
some dynamical features of BE. It cannot be excluded that some systems in- 
volving BE states may reveal a nonstandard (nonexponential) decay of entan- 
glement. In general, it seems that the role of bound entanglement in quantum 
communication will be negative: in fact, the existence of BE constitutes a fun- 
damental restriction on entanglement processing. One can speculate that it is 
the ultimate restriction in the context of distillation, i.e. that it may allow one 
to determine the value of the distillable entanglement. Hence it seems impor- 
tant to develop an approach combining BE and the entanglement measures 
involving relative entropy. It also seems reasonable to conjecture that, in the 
case of general distillation processes involving the conversion of mixed states 
[66], the bound entanglement $E_B$ never decreases (i.e. $\Delta E_B \ge 0$) in
optimal processes.

The irreversibility inherently connected with distillation encourages us to 
develop some natural formal analogies between mixed-state entanglement 
processing and phenomenological thermodynamics. The construction of a 
“thermodynamics of entanglement” (cf. [325, 335, 347, 365]) would be es- 
sential for a synthetic understanding of entanglement processing. Of course, 
progress in the above directions would require the development of various 
techniques of searching for bound entangled states. 

One of the challenges of mixed-state entanglement theory is to determine 
which states are useful for quantum communication with given additional 
resources. In particular, we still do not know (i) which states are distillable 
under LOCC (i.e. which states are free entangled), and (ii) which states are
distillable under one-way classical communication and local operations. 

A promising direction for mixed-state entanglement theory is its application
to the theory of quantum channel capacity, pioneered in [66]. In particular,
the methods leading to upper bounds for distillable entanglement described in
Sect. 28 allow one to obtain upper bounds for quantum channel capacities [101]
(one of them was obtained earlier [97]). It has been shown [101] that the
following hypothetical inequality

$$D_1(\varrho) \ge S(\varrho_B) - S(\varrho) , \qquad (5.76)$$

where $D_1(\varrho)$ is the one-way distillable entanglement, would imply
equality between the capacity of a quantum channel and the maximal rate of
coherent information [366]. The latter equality would be nothing but a quantum
Shannon theorem, with coherent information being the counterpart of mutual
information. All the results obtained so far in the domain of quantifying
entanglement indicate that the inequality is true. However, a proof of the
inequality has not been found so far.

Footnote: The bound entanglement can be quantified [71] as the difference
between the entanglement of formation and the entanglement of distillation
(defined within the original distillation scheme): $E_B = E_F - D$.

Finally, one would like to have a clear connection between entanglement 
and its basic manifestation - nonlocality. One can assume that free entangled 
states exhibit nonlocality via a distillation process [265, 266]. However, the 
question concerning the possible nonlocality of BE states remains open (see 
[367, 368, 369]). 

To answer the above and many other questions, one must develop the 
mathematical description of the structure of mixed-state entanglement. In 
this context, it would be especially important to push forward the mathe- 
matics of positive maps. One hopes that the exciting physics connected with 
mixed-state entanglement that we have presented in this contribution will 
stimulate progress in this domain. 
Footnote (on the one-way distillable entanglement): Classical messages can be
sent only from Alice to Bob during distillation.